1
Lu XY, Wu HP, Ma H, Li H, Li J, Liu YT, Pan ZY, Xie Y, Wang L, Ren B, Liu GK. Deep Learning-Assisted Spectrum-Structure Correlation: State-of-the-Art and Perspectives. Anal Chem 2024; 96:7959-7975. [PMID: 38662943 DOI: 10.1021/acs.analchem.4c01639] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024]
Abstract
Spectrum-structure correlation is playing an increasingly crucial role in spectral analysis and has undergone significant development in recent decades. With the advancement of spectrometers, high-throughput detection has triggered explosive growth of spectral data, and the extension of research from small molecules to biomolecules brings a massive chemical space. Facing the evolving landscape of spectrum-structure correlation, conventional chemometrics has become ill-equipped, and deep learning-assisted chemometrics is rapidly emerging as a flourishing approach with a superior ability to extract latent features and make precise predictions. In this review, the molecular and spectral representations and fundamental knowledge of deep learning are first introduced. We then summarize how deep learning has helped establish the correlation between spectrum and molecular structure over the past 5 years, by empowering spectral prediction (i.e., forward structure-spectrum correlation) and further enabling library matching and de novo molecular generation (i.e., inverse spectrum-structure correlation). Finally, we highlight the most important open issues that persist, together with potential solutions. With the fast development of deep learning, an ultimate solution for establishing spectrum-structure correlation is expected soon, which would trigger substantial development in various disciplines.
Affiliation(s)
- Xin-Yu Lu
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Tan Kah Kee Innovation Laboratory, Xiamen 361005, P. R. China
- Hao-Ping Wu
- State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian 361102, P. R. China
- Hao Ma
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Tan Kah Kee Innovation Laboratory, Xiamen 361005, P. R. China
- Hui Li
- Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, Xiamen 361005, P. R. China
- Jia Li
- Institute of Artificial Intelligence, Xiamen University, Xiamen 361005, P. R. China
- Yan-Ti Liu
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Tan Kah Kee Innovation Laboratory, Xiamen 361005, P. R. China
- Zheng-Yan Pan
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Yi Xie
- School of Informatics, Xiamen University, Xiamen 361005, P. R. China
- Lei Wang
- Pen-Tung Sah Institute of Micro-Nano Science and Technology, Xiamen University, Xiamen 361005, P. R. China
- Bin Ren
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Tan Kah Kee Innovation Laboratory, Xiamen 361005, P. R. China
- Guo-Kun Liu
- State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian 361102, P. R. China
2
Duan S, Tian G, Luo Y. Theoretical and computational methods for tip- and surface-enhanced Raman scattering. Chem Soc Rev 2024; 53:5083-5117. [PMID: 38596836 DOI: 10.1039/d3cs01070h] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/11/2024]
Abstract
Raman spectroscopy is a versatile tool for acquiring molecular structure information. The incorporation of plasmonic fields has significantly enhanced the sensitivity and resolution of surface-enhanced Raman scattering (SERS) and tip-enhanced Raman spectroscopy (TERS). The strong spatial confinement effect of plasmonic fields has challenged the conventional Raman theory, in which a plane wave approximation for the light has been adopted. In this review, we comprehensively survey the progress of a generalized theory for SERS and TERS in the framework of effective field Hamiltonian (EFH). With this approach, all characteristics of localized plasmonic fields can be well taken into account. By employing EFH, quantitative simulations at the first-principles level for state-of-the-art experimental observations have been achieved, revealing the underlying intrinsic physics in the measurements. The predictive power of EFH is demonstrated by several new phenomena generated from the intrinsic spatial, momentum, time, and energy structures of the localized plasmonic field. The corresponding experimental verifications are also carried out briefly. A comprehensive computational package for modeling of SERS and TERS at the first-principles level is introduced. Finally, we provide an outlook on the future developments of theory and experiments for SERS and TERS.
Affiliation(s)
- Sai Duan
- Collaborative Innovation Center of Chemistry for Energy Materials, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, MOE Key Laboratory of Computational Physical Sciences, Department of Chemistry, Fudan University, Shanghai 200433, China.
- Guangjun Tian
- State Key Laboratory of Metastable Materials Science & Technology and Key Laboratory for Microstructural Material Physics of Hebei Province, School of Science, Yanshan University, Qinhuangdao 066004, China
- Yi Luo
- Hefei National Research Center for Physical Science at the Microscale and Synergetic Innovation Center of Quantum Information & Quantum Physics, University of Science and Technology of China, Hefei, Anhui 230026, China.
- Hefei National Laboratory, University of Science and Technology of China, Hefei, 230088, China
3
Ma B, Chen H, Gong J, Liu W, Wei X, Zhang Y, Li X, Li M, Wang Y, Shang S, Tian B, Li Y, Wang R, Tan Z. Enhancing Protein Solubility via Glycosylation: From Chemical Synthesis to Machine Learning Predictions. Biomacromolecules 2024; 25:3001-3010. [PMID: 38598264 DOI: 10.1021/acs.biomac.4c00134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/11/2024]
Abstract
Glycosylation is a valuable tool for modulating protein solubility; however, the lack of reliable research strategies has impeded efficient progress in understanding and applying this modification. This study aimed to bridge this gap by investigating the solubility of a model glycoprotein molecule, the carbohydrate-binding module (CBM), through a two-stage process. In the first stage, an approach involving chemical synthesis, comparative analysis, and molecular dynamics simulations of a library of glycoforms was employed to elucidate the effect of different glycosylation patterns on solubility and the key factors responsible for the effect. In the second stage, a predictive mathematical formula, innovatively harnessing machine learning algorithms, was derived to relate solubility to the identified key factors and accurately predict the solubility of the newly designed glycoforms. Demonstrating feasibility and effectiveness, this two-stage approach offers a valuable strategy for advancing glycosylation research, especially for the discovery of glycoforms with increased solubility.
Affiliation(s)
- Bo Ma
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China
- Hedi Chen
- School of Pharmaceutical Sciences, Tsinghua University, Beijing 100084, China
- Jinyuan Gong
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China
- Wenqiang Liu
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China
- Xiuli Wei
- Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Yajing Zhang
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China
- Xin Li
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China
- Meng Li
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China
- Yani Wang
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China
- Shiying Shang
- Center of Pharmaceutical Technology, School of Pharmaceutical Sciences, Tsinghua University, Beijing 100084, China
- Boxue Tian
- School of Pharmaceutical Sciences, Tsinghua University, Beijing 100084, China
- Yaohao Li
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China
- Ruihan Wang
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China
- Chemical Engineering College, Hebei Normal University of Science and Technology, Qinhuangdao 066600, China
- Zhongping Tan
- State Key Laboratory of Bioactive Substance and Function of Natural Medicines, Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China
4
Jiang S, Wang X, Chong Y, Huang Y, Hu W, Smith PES, Jiang J, Feng S. Spectra-Based Machine Learning for Predicting the Statistical Interaction Properties of CO Adsorbates on Surface. J Phys Chem Lett 2024; 15:2400-2404. [PMID: 38393989 DOI: 10.1021/acs.jpclett.4c00011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2024]
Abstract
Theoretical analyses of small-molecule adsorption on heterogeneous catalyst surfaces often rely on simplified models of molecular adsorption with the most favorable configuration. Given that real-world experimental tests frequently entail multiple molecules interacting with the surface, there is a pressing need for a comprehensive multimolecule adsorption model to bridge the gap between theory and experiment. Using machine learning, we predict the average values of important adsorption properties from conformationally averaged, calculated infrared and Raman spectra and compare these values to those theoretically derived from the conformationally averaged ensemble. Remarkably, our approach yields excellent predictions even when faced with large and indeterminate numbers of surface molecules. These quantitative spectra-averaged property relationships provide a theoretical framework for extracting key interaction properties from the spectra of real chemical environments.
Affiliation(s)
- Shuang Jiang
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, China
- Xijun Wang
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, China
- Yuanyuan Chong
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, China
- Yan Huang
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, China
- Wei Hu
- School of Chemistry and Chemical Engineering, Qilu University of Technology (Shandong Academy of Science), Jinan 250353, China
- Jun Jiang
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, China
- Shuo Feng
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, China
5
Usoltsev O, Tereshchenko A, Skorynina A, Kozyr E, Soldatov A, Safonova O, Clark AH, Ferri D, Nachtegaal M, Bugaev A. Machine Learning for Quantitative Structural Information from Infrared Spectra: The Case of Palladium Hydride. Small Methods 2024:e2301397. [PMID: 38295064 DOI: 10.1002/smtd.202301397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Revised: 01/09/2024] [Indexed: 02/02/2024]
Abstract
Infrared (IR) spectroscopy is a widely used technique that makes it possible to identify specific functional groups in the molecule of interest based on their characteristic vibrational modes, or the presence of a specific adsorption site based on the characteristic vibrational mode of an adsorbed probe molecule. The interpretation of an IR spectrum is generally carried out within a fingerprint paradigm by comparing the observed spectral features with the features of known references or theoretical calculations. This work demonstrates a method for extracting quantitative structural information beyond this approach by applying machine learning (ML) algorithms. Taking palladium hydride formation as an example, Pd-H pressure-composition isotherms are reconstructed using IR data collected in situ in diffuse reflectance with CO as a probe molecule. To the best of the authors' knowledge, this is the first example of the determination of continuous structural descriptors (such as interatomic distance and stoichiometric coefficient) from the fine structure of vibrational spectra, which opens new possibilities for using IR spectra in structural analysis.
Affiliation(s)
- Oleg Usoltsev
- ALBA Synchrotron, Cerdanyola del Valles, Barcelona, 08290, Spain
- Alina Skorynina
- ALBA Synchrotron, Cerdanyola del Valles, Barcelona, 08290, Spain
- Alexander Soldatov
- Southern Federal University, Sladkova 178/24, Rostov-on-Don, 344090, Russia
- Olga Safonova
- Paul Scherrer Institute, Forschungsstrasse 111, Villigen, 5232, Switzerland
- Adam H Clark
- Paul Scherrer Institute, Forschungsstrasse 111, Villigen, 5232, Switzerland
- Davide Ferri
- Paul Scherrer Institute, Forschungsstrasse 111, Villigen, 5232, Switzerland
- Maarten Nachtegaal
- Paul Scherrer Institute, Forschungsstrasse 111, Villigen, 5232, Switzerland
- Aram Bugaev
- Paul Scherrer Institute, Forschungsstrasse 111, Villigen, 5232, Switzerland
6
Du W, Ma F, Zhang B, Zhang J, Wu D, Sharman E, Jiang J, Wang Y. Spectroscopy-Guided Deep Learning Predicts Solid-Liquid Surface Adsorbate Properties in Unseen Solvents. J Am Chem Soc 2024; 146:811-823. [PMID: 38157302 PMCID: PMC10785802 DOI: 10.1021/jacs.3c10921] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Revised: 12/13/2023] [Accepted: 12/14/2023] [Indexed: 01/03/2024]
Abstract
Accurately and rapidly acquiring the microscopic properties of a material is crucial for catalysis and electrochemistry. Characterization tools, such as spectroscopy, can be a valuable tool to infer these properties, and when combined with machine learning tools, they can theoretically achieve fast and accurate prediction results. However, on the path to practical applications, training a reliable machine learning model is faced with the challenge of uneven data distribution in a vast array of non-negligible solvent types. Herein, we employ a combination of the first-principles-based approach and data-driven model. Specifically, we utilize density functional theory (DFT) to calculate theoretical spectral data of CO-Ag adsorption in 23 different solvent systems as a data source. Subsequently, we propose a hierarchical knowledge extraction multiexpert neural network (HMNN) to bridge the knowledge gaps among different solvent systems. HMNN undergoes two training tiers: in tier I, it learns fundamental quantitative spectra-property relationships (QSPRs), and in tier II, it inherits the fundamental QSPR knowledge from previous steps through a dynamic integration of expert modules and subsequently captures the solvent differences. The results demonstrate HMNN's superiority in estimating a range of molecular adsorbate properties, with an error range of less than 0.008 eV for zero-shot predictions on unseen solvents. The findings underscore the usability, reliability, and convenience of HMNN and could pave the way for real-time access to microscopic properties by exploiting QSPR.
Affiliation(s)
- Wenjie Du
- Key Laboratory of Precision and Intelligent Chemistry, University of Science and Technology of China, Hefei, Anhui 230026, China
- School of Software Engineering, University of Science and Technology of China, Hefei, Anhui 230026, China
- Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, Jiangsu 215123, China
- Fenfen Ma
- Key Laboratory of Precision and Intelligent Chemistry, University of Science and Technology of China, Hefei, Anhui 230026, China
- School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
- Gusu Laboratory of Materials, Suzhou, Jiangsu 215123, China
- Baicheng Zhang
- Key Laboratory of Precision and Intelligent Chemistry, University of Science and Technology of China, Hefei, Anhui 230026, China
- School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
- Jiahui Zhang
- School of Software Engineering, University of Science and Technology of China, Hefei, Anhui 230026, China
- Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, Jiangsu 215123, China
- Di Wu
- School of Software Engineering, University of Science and Technology of China, Hefei, Anhui 230026, China
- Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, Jiangsu 215123, China
- Edward Sharman
- Department of Neurology, University of California, Irvine, California 92697, United States
- Jun Jiang
- Key Laboratory of Precision and Intelligent Chemistry, University of Science and Technology of China, Hefei, Anhui 230026, China
- School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
- Yang Wang
- Key Laboratory of Precision and Intelligent Chemistry, University of Science and Technology of China, Hefei, Anhui 230026, China
- School of Software Engineering, University of Science and Technology of China, Hefei, Anhui 230026, China
- Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, Jiangsu 215123, China
7
Bi X, Lin L, Chen Z, Ye J. Artificial Intelligence for Surface-Enhanced Raman Spectroscopy. Small Methods 2024; 8:e2301243. [PMID: 37888799 DOI: 10.1002/smtd.202301243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 10/11/2023] [Indexed: 10/28/2023]
Abstract
Surface-enhanced Raman spectroscopy (SERS), well acknowledged as a fingerprinting and sensitive analytical technique, has exerted high applicational value in a broad range of fields including biomedicine, environmental protection, food safety among the others. In the endless pursuit of ever-sensitive, robust, and comprehensive sensing and imaging, advancements keep emerging in the whole pipeline of SERS, from the design of SERS substrates and reporter molecules, synthetic route planning, instrument refinement, to data preprocessing and analysis methods. Artificial intelligence (AI), which is created to imitate and eventually exceed human behaviors, has exhibited its power in learning high-level representations and recognizing complicated patterns with exceptional automaticity. Therefore, facing up with the intertwining influential factors and explosive data size, AI has been increasingly leveraged in all the above-mentioned aspects in SERS, presenting elite efficiency in accelerating systematic optimization and deepening understanding about the fundamental physics and spectral data, which far transcends human labors and conventional computations. In this review, the recent progresses in SERS are summarized through the integration of AI, and new insights of the challenges and perspectives are provided in aim to better gear SERS toward the fast track.
Affiliation(s)
- Xinyuan Bi
- State Key Laboratory of Systems Medicine for Cancer, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200030, P. R. China
- Li Lin
- State Key Laboratory of Systems Medicine for Cancer, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200030, P. R. China
- Zhou Chen
- State Key Laboratory of Systems Medicine for Cancer, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200030, P. R. China
- Jian Ye
- State Key Laboratory of Systems Medicine for Cancer, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200030, P. R. China
- Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, 200127, P. R. China
- Shanghai Key Laboratory of Gynecologic Oncology, Ren Ji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200127, P. R. China
8
Zhao Y, Li H, Shan J, Zhang Z, Li X, Shi JQ, Jiao Y, Li H. Machine Learning Confirms the Formation Mechanism of a Single-Atom Catalyst via Infrared Spectroscopic Analysis. J Phys Chem Lett 2023:11058-11062. [PMID: 38048178 DOI: 10.1021/acs.jpclett.3c02896] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/06/2023]
Abstract
Single-atom catalysts (SACs) offer significant potential across various applications, yet our understanding of their formation mechanism remains limited. Notably, the pyrolysis of zeolitic imidazolate frameworks (ZIFs) stands as a pivotal avenue for SAC synthesis, of which the mechanism can be assessed through infrared (IR) spectroscopy. However, the prevailing analysis techniques still rely on manual interpretation. Here, we report a machine learning (ML)-driven analysis of the IR spectroscopy to unravel the pyrolysis process of Pt-doped ZIF-67 to synthesize Pt-Co3O4 SAC. Demonstrating a total Pearson correlation exceeding 0.7 with experimental data, the algorithm provides correlation coefficients for the selected structures, thereby confirming crucial structural changes with time and temperature, including the decomposition of ZIF and formation of Pt-O bonds. These findings reveal and confirm the formation mechanism of SACs. As demonstrated, the integration of ML algorithms, theoretical simulations, and experimental spectral analysis introduces an approach to deciphering experimental characterization data, implying its potential for broader adoption.
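As a rough illustration of the agreement metric mentioned in the abstract above, the following sketch computes a Pearson correlation coefficient between a simulated and a measured spectrum sampled on the same wavenumber grid. The function name `pearson_r` and the toy intensity values are assumptions made for this example; they are not taken from the paper, and the authors' actual analysis pipeline is not reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Covariance and variances, without the common 1/n factor (it cancels).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical intensities on a shared wavenumber grid (illustrative only).
simulated = [0.10, 0.40, 0.90, 0.30, 0.05]
measured = [0.12, 0.35, 1.00, 0.28, 0.07]
r = pearson_r(simulated, measured)
print(f"Pearson r = {r:.3f}")
```

A value close to 1 indicates that the two spectra rise and fall together; the abstract reports a total correlation exceeding 0.7 against experimental data.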
Affiliation(s)
- Yanzhang Zhao
- School of Chemical Engineering, The University of Adelaide, Adelaide, South Australia 5005, Australia
- Huan Li
- School of Chemical Engineering, The University of Adelaide, Adelaide, South Australia 5005, Australia
- Jieqiong Shan
- School of Chemical Engineering, The University of Adelaide, Adelaide, South Australia 5005, Australia
- Department of Chemistry, City University of Hong Kong, Kowloon 999077, Hong Kong Special Administrative Region of the People's Republic of China
- Zhen Zhang
- Australian Institute for Machine Learning, The University of Adelaide, Adelaide, South Australia 5000, Australia
- Xinyu Li
- Australian Institute for Machine Learning, The University of Adelaide, Adelaide, South Australia 5000, Australia
- Javen Qinfeng Shi
- Australian Institute for Machine Learning, The University of Adelaide, Adelaide, South Australia 5000, Australia
- Yan Jiao
- School of Chemical Engineering, The University of Adelaide, Adelaide, South Australia 5005, Australia
- Haobo Li
- School of Chemical Engineering, The University of Adelaide, Adelaide, South Australia 5005, Australia
9
Feng S, Cai A, Wang Y, Zhang B, Qiao Q, Chen C, Wang S, Jiang J. A robotic AI-Chemist system for multi-modal AI-ready database. Natl Sci Rev 2023; 10:nwad332. [PMID: 38226367 PMCID: PMC10789233 DOI: 10.1093/nsr/nwad332] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/30/2023] [Revised: 10/16/2023] [Accepted: 10/31/2023] [Indexed: 01/17/2024]
Abstract
By fusing literature data mining, high-performance simulations, and high-accuracy experiments, robotic AI-Chemist can achieve automated high-throughput production, classification, cleaning, association and fusion of data, and thus develop a multi-modal AI-ready database.
Affiliation(s)
- Shuo Feng
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, China
- Aoran Cai
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, China
- Yang Wang
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, China
- Baicheng Zhang
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, China
- Qinyu Qiao
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, China
- Cheng Chen
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, China
- Song Wang
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, China
- Jun Jiang
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, China
10
Yang T, Zhou D, Ye S, Li X, Li H, Feng Y, Jiang Z, Yang L, Ye K, Shen Y, Jiang S, Feng S, Zhang G, Huang Y, Wang S, Jiang J. Catalytic Structure Design by AI Generating with Spectroscopic Descriptors. J Am Chem Soc 2023. [PMID: 38019281 DOI: 10.1021/jacs.3c09299] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2023]
Abstract
Generative artificial intelligence has depicted a beautiful blueprint for on-demand design in chemical research. However, the few successful chemical generations have only been able to implement a few special property values because most chemical descriptors are mathematically discrete or discontinuously adjustable. Herein, we use spectroscopic descriptors with machine learning to establish a quantitative spectral structure-property relationship for adsorbed molecules on metal monatomic catalysts. Besides catalytic properties such as adsorption energy and charge transfer, the complete spatial relative coordinates of the adsorbed molecule were successfully inverted. The spectroscopic descriptors and prediction models are generalized, allowing them to be transferred to several different systems. Due to the continuous tunability of the spectroscopic descriptors, the design of catalytic structures with continuous adsorption states generated by AI in the catalytic process has been achieved. This work paves the way for using spectroscopy to enable real-time monitoring of the catalytic process and continuous customization of catalytic performance, which will lead to profound changes in catalytic research.
Affiliation(s)
- Tongtong Yang
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
- Institute of Intelligent Innovation, Henan Academy of Sciences, Zhengzhou, Henan 451162, P. R. China
- Donglai Zhou
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
- Sheng Ye
- School of Artificial Intelligence, Anhui University, Hefei, Anhui 230601, China
- Xiyu Li
- Songshan Lake Materials Laboratory, Dongguan, Guangdong 523808, China
- Huirong Li
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
- Yi Feng
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
- Zifan Jiang
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
- Li Yang
- Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui 230601, China
- Ke Ye
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Yixi Shen
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Shuang Jiang
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Shuo Feng
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Guozhen Zhang
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Yan Huang
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Song Wang
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Jun Jiang
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
11
Guo S, Jiang J, Ren H, Wang S. Fusion of Multiple Spectra for Investigating Chemical Bonding Properties via Machine Learning. J Phys Chem Lett 2023; 14:7461-7468. [PMID: 37579021 DOI: 10.1021/acs.jpclett.3c01709] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/16/2023]
Abstract
Chemical bonding properties are crucial to understanding the chemical behavior of molecules. Spectroscopy is a versatile tool for studying various microscopic properties, but its interpretation suffers from human biases and the loss of high-dimensional information. Here, we present a machine learning approach to predict diverse bonding properties, including the bond dissociation energy (BDE), bond length, and α-C connectivity of hydroxyls in organic molecules, by fusing multiple spectra with different physical mechanisms. Combining nuclear magnetic resonance and vibrational spectroscopy yields higher prediction accuracy than either achieves separately. On the hold-out test data set, the models achieve mean absolute errors of 1.243 kcal/mol for BDE and 1.041 × 10⁻⁴ Å for bond length, and an accuracy of 95.09% for hydroxyl α-C connectivity. Our models demonstrate strong extrapolation capabilities when they are transferred to different molecules, external electric fields, and solvation environments. These end-to-end models pave the way to investigating chemical bonding properties by using spectroscopic observables.
Affiliation(s)
- Sibei Guo
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
- Jun Jiang
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
- Hefei National Laboratory, University of Science and Technology of China, Hefei, Anhui 230088, China
- Hao Ren
- School of Materials Science and Engineering, China University of Petroleum (East China), Qingdao, Shandong 266580, China
- Song Wang
- Key Laboratory of Precision and Intelligent Chemistry, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui 230026, China
12
Mou LH, Han T, Smith PES, Sharman E, Jiang J. Machine Learning Descriptors for Data-Driven Catalysis Study. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2023:e2301020. [PMID: 37191279 PMCID: PMC10401178 DOI: 10.1002/advs.202301020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Revised: 04/07/2023] [Indexed: 05/17/2023]
Abstract
Traditional trial-and-error experiments and theoretical simulations have difficulty optimizing catalytic processes and developing new, better-performing catalysts. Machine learning (ML) provides a promising approach for accelerating catalysis research due to its powerful learning and predictive abilities. The selection of appropriate input features (descriptors) plays a decisive role in improving the predictive accuracy of ML models and uncovering the key factors that influence catalytic activity and selectivity. This review introduces tactics for the utilization and extraction of catalytic descriptors in ML-assisted experimental and theoretical research. In addition to the effectiveness and advantages of various descriptors, their limitations are also discussed. Highlighted are both 1) newly developed spectral descriptors for catalytic performance prediction and 2) a novel research paradigm combining computational and experimental ML models through suitable intermediate descriptors. Current challenges and future perspectives on the application of descriptors and ML techniques to catalysis are also presented.
Affiliation(s)
- Li-Hui Mou
- Hefei National Research Center for Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui, 230026, China
- TianTian Han
- Hefei JiShu Quantum Technology Co. Ltd., Hefei, 230026, China
- Pieter E S Smith
- YDS Pharmatech, ETEC, 1220 Washington Ave., Albany, NY, 12203, USA
- Edward Sharman
- Department of Neurology, University of California, Irvine, CA, 92697, USA
- Jun Jiang
- Hefei National Research Center for Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei, Anhui, 230026, China
13
Chong Y, Huo Y, Jiang S, Wang X, Zhang B, Liu T, Chen X, Han T, Smith P, Wang S, Jiang J. Machine learning of spectra-property relationship for imperfect and small chemistry data. Proc Natl Acad Sci U S A 2023; 120:e2220789120. [PMID: 37155896 PMCID: PMC10193941 DOI: 10.1073/pnas.2220789120] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 12/06/2022] [Accepted: 03/16/2023] [Indexed: 05/10/2023] Open
Abstract
Machine learning (ML) is causing profound changes to chemical research through its powerful statistical and mathematical methodological capabilities. However, the nature of chemistry experiments often sets very high hurdles to collecting high-quality, deficiency-free data, which conflicts with the need of ML to learn from big data. Even worse, the black-box nature of most ML methods requires more abundant data to ensure good transferability. Herein, we combine physics-based spectral descriptors with a symbolic regression method to establish interpretable spectra-property relationships. Using the machine-learned mathematical formulas, we have predicted the adsorption energy and charge transfer of the CO-adsorbed Cu-based MOF systems from their infrared and Raman spectra. The explicit prediction models are robust, allowing them to be transferred to small, low-quality datasets containing partial errors. Surprisingly, they can also be used to identify and clean erroneous data, which are common in real experiments. Such a robust learning protocol will significantly enhance the applicability of machine-learned spectroscopy for chemical science.
Affiliation(s)
- Yuanyuan Chong
- Hefei National Research Center for Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, China
- Yaoyuan Huo
- Hefei National Research Center for Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, China
- Shuang Jiang
- Hefei National Research Center for Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, China
- Xijun Wang
- Hefei National Research Center for Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, China
- Baichen Zhang
- Hefei National Research Center for Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, China
- Tianfu Liu
- State Key Laboratory of Structural Chemistry, Fujian Institute of Research on the Structure of Matter, Chinese Academy of Science, Fuzhou 350002, China
- Xin Chen
- GuSu Laboratory of Materials, Suzhou 215123, China
- TianTian Han
- Hefei JiShu Quantum Technology Co. Ltd., Hefei 230026, China
- Song Wang
- Hefei National Research Center for Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, China
- Jun Jiang
- Hefei National Research Center for Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, China
14
Li H, Jiao Y, Davey K, Qiao SZ. Data-Driven Machine Learning for Understanding Surface Structures of Heterogeneous Catalysts. Angew Chem Int Ed Engl 2023; 62:e202216383. [PMID: 36509704 DOI: 10.1002/anie.202216383] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 11/07/2022] [Revised: 12/11/2022] [Accepted: 12/12/2022] [Indexed: 12/15/2022]
Abstract
The design of heterogeneous catalysts is necessarily surface-focused, generally achieved via optimization of adsorption energy and microkinetic modelling. A prerequisite to ensure that the adsorption energy is physically meaningful is the stable existence of the conceived active-site structure on the surface. Developing an improved understanding of the catalyst surface, however, is challenging in practice because of the complex nature of dynamic surface formation and evolution under in-situ reactions. We therefore propose data-driven machine-learning (ML) approaches as a solution. In this Minireview we summarize recent progress in using machine learning to search for and predict (meta)stable structures, assist operando simulation under reaction conditions and micro-environments, and critically analyze experimental characterization data. We conclude that ML will become the new norm to lower costs associated with discovery and design of optimal heterogeneous catalysts.
Affiliation(s)
- Haobo Li
- School of Chemical Engineering and Advanced Materials, The University of Adelaide, Adelaide, SA 5005, Australia
- Yan Jiao
- School of Chemical Engineering and Advanced Materials, The University of Adelaide, Adelaide, SA 5005, Australia
- Kenneth Davey
- School of Chemical Engineering and Advanced Materials, The University of Adelaide, Adelaide, SA 5005, Australia
- Shi-Zhang Qiao
- School of Chemical Engineering and Advanced Materials, The University of Adelaide, Adelaide, SA 5005, Australia