1
E U, T M. DICCA-DTA: Diffusion and Contextualized Capsule Attention guided Factorized Cross-Pooling for Drug-Target Affinity prediction. Comput Biol Chem 2025; 118:108472. [PMID: 40288256 DOI: 10.1016/j.compbiolchem.2025.108472] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2025] [Revised: 03/27/2025] [Accepted: 04/08/2025] [Indexed: 04/29/2025]
Abstract
Drug-Target Affinity (DTA) prediction plays a crucial role in the drug discovery process by evaluating the strength of the interaction between a drug and its biological target, which is often a protein. Despite advancements in DTA prediction through deep learning, several fundamental challenges persist: (i) suboptimal information propagation in molecular graphs, limiting the effective representation of complex drug structures, (ii) accurately modeling the complex interactions between drug-binding sites and protein substructures, and (iii) prioritizing critical substructure interactions to enhance both accuracy and interpretability. To address these challenges, the DICCA-DTA framework is introduced, aiming to improve the contextual integration of molecular information and facilitate a more comprehensive representation of drug-target interactions in allopathic research. It employs a Diffused Isomorphic Network (DIN) to extract comprehensive drug features from molecular graphs, capturing both local substructures and global information. Furthermore, a Contextualized Capsule Attention Network (CCAN) module incorporates multi-head attention with capsule networks to capture both local and global protein sequence characteristics. The attention-guided Factorized Cross-Pooling (FCP) mechanism dynamically refines drug-protein interaction modeling by selectively emphasizing critical binding site interactions, thereby enhancing predictive accuracy. Explainable attention maps further reveal the most crucial drug-protein binding site interactions, providing transparent insights into the model's decision-making process. Comprehensive evaluations across the Davis, KIBA, Metz and BindingDB datasets demonstrate the superior performance of the DICCA-DTA framework over existing state-of-the-art models. 
A case study on cancer-related protein interactions from the DrugBank database further demonstrates the framework's precision in identifying key drug-protein affinities, reinforcing its potential to accelerate drug discovery and repurposing.
Affiliation(s)
- Uma E
  - Department of Information Science and Technology, College of Engineering Guindy, Chennai, India
- Mala T
  - Department of Information Science and Technology, College of Engineering Guindy, Chennai, India
2
Rosa LS, Sarhan M, Pimentel AS. Toxic Alerts of Endocrine Disruption Revealed by Explainable Artificial Intelligence. Environment & Health (Washington, D.C.) 2025; 3:321-333. [PMID: 40144324 PMCID: PMC11934200 DOI: 10.1021/envhealth.4c00218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2024] [Revised: 01/10/2025] [Accepted: 01/16/2025] [Indexed: 03/28/2025]
Abstract
The local interpretable model-agnostic explanation method was used to unveil substructures (toxic alerts) that cause endocrine disruption in chemical compounds using machine learning models. The random forest classifier was applied to build explainable models with the TOX21 data sets after data curation. Using these models applied to the EDC and EDKB-FDA data sets, the substructures that cause endocrine disruption in chemical compounds were unveiled, providing stable, more specific, and consistent explanations, which are essential for trust and acceptance of the findings, mainly due to the difficulty of finding relevant experimental evidence for different receptors (androgen, estrogen, aryl hydrocarbon, aromatase, and peroxisome proliferator-activated receptors). This approach is significant because of its contribution to the interpretability of explainable machine learning algorithms, particularly in the context of unveiling substructures associated with endocrine disruption in five targets (androgen receptor, estrogen receptor, aryl hydrocarbon receptors, aromatase receptors, and peroxisome proliferator-activated receptors), thereby advancing the relevant field of environmental toxicology, where a careful evaluation of the potential risks of exposure to new compounds is needed. The specific substructures thiophosphate, sulfamate, anilide, carbamate, sulfamide, and thiocyanate are presented as toxic alerts that cause endocrine disruption to better understand their potential risks and adverse effects on human health and the environment.
Affiliation(s)
- Lucca Caiaffa Santos Rosa
  - Departamento de Química, Pontifícia Universidade Católica do Rio de Janeiro, Rio de Janeiro, RJ 22453-900, Brazil
- Mariam Sarhan
  - Departamento de Química, Pontifícia Universidade Católica do Rio de Janeiro, Rio de Janeiro, RJ 22453-900, Brazil
- Andre Silva Pimentel
  - Departamento de Química, Pontifícia Universidade Católica do Rio de Janeiro, Rio de Janeiro, RJ 22453-900, Brazil
3
Contreras J, Mostafapour S, Popp J, Bocklitz T. Siamese Networks for Clinically Relevant Bacteria Classification Based on Raman Spectroscopy. Molecules 2024; 29:1061. [PMID: 38474573 PMCID: PMC10934697 DOI: 10.3390/molecules29051061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Revised: 02/07/2024] [Accepted: 02/27/2024] [Indexed: 03/14/2024] Open
Abstract
Identifying bacterial strains is essential in microbiology for various practical applications, such as disease diagnosis and quality monitoring of food and water. Classical machine learning algorithms have been utilized to identify bacteria based on their Raman spectra. Convolutional neural networks (CNNs) offer higher classification accuracy, but they require extensive training sets, and retraining them for previously untrained class targets can be costly and time-consuming. Siamese networks have emerged as a promising solution. They are composed of two CNNs with the same structure and a final network that acts as a distance metric, converting the classification problem into a similarity problem. Classical machine learning approaches, shallow and deep CNNs, and two Siamese network variants were tailored and tested on Raman spectral datasets of bacteria. The methods were evaluated based on mean sensitivity, training time, prediction time, and the number of parameters. In this comparison, Siamese-model2 achieved the highest mean sensitivity of 83.61 ± 4.73 and demonstrated remarkable performance in handling unbalanced and limited data scenarios, achieving a prediction accuracy of 73%. Therefore, the choice of model depends on the specific trade-off between accuracy, (prediction/training) time, and resources for the particular application. Classical machine learning models and shallow CNN models may be more suitable if time and computational resources are a concern. Siamese networks are a good choice for small datasets, and CNNs for extensive data.
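The Siamese idea in this abstract (two branches with the same structure feeding a distance measure, so that classification becomes a similarity comparison) can be illustrated with a minimal sketch. This is not the authors' model: the linear `embed` function, the weight matrix `W`, the toy three-point "spectra", and the contrastive loss are all assumptions made for illustration. The abstract describes a learned final network as the distance metric, whereas this sketch substitutes a plain Euclidean distance, and no training is performed.

```python
import math

def embed(x, weights):
    """Shared branch: a linear map followed by ReLU (a stand-in for a CNN)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]

def distance(a, b):
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def contrastive_loss(d, same_class, margin=1.0):
    """Penalize distance for same-class pairs and closeness (within a margin) otherwise."""
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2

def classify(query, references, weights):
    """Assign the label of the closest reference spectrum in embedding space."""
    q = embed(query, weights)
    return min(references, key=lambda label: distance(q, embed(references[label], weights)))

# Hypothetical weights and toy spectra (three intensity values each).
W = [[1.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
references = {"strain_A": [0.9, 0.1, 0.0], "strain_B": [0.0, 0.2, 0.9]}
query = [0.8, 0.2, 0.1]

predicted = classify(query, references, W)
print(predicted)
```

Because both inputs pass through the same `embed` with the same weights, a new class can in principle be handled by adding one more entry to `references` rather than retraining a classifier head, which is the property the abstract highlights for limited-data settings.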
Affiliation(s)
- Jhonatan Contreras
  - Institute of Physical Chemistry (IPC) and Abbe Center of Photonics (ACP), Friedrich Schiller University Jena, Member of the Leibniz Centre for Photonics in Infection Research (LPI), Helmholtzweg 4, 07743 Jena, Germany; (J.C.); (S.M.); (J.P.)
  - Leibniz Institute of Photonic Technology, Member of Leibniz Health Technologies, Member of the Leibniz Centre for Photonics in Infection Research (LPI), Albert Einstein Straße 9, 07745 Jena, Germany
- Sara Mostafapour
  - Institute of Physical Chemistry (IPC) and Abbe Center of Photonics (ACP), Friedrich Schiller University Jena, Member of the Leibniz Centre for Photonics in Infection Research (LPI), Helmholtzweg 4, 07743 Jena, Germany; (J.C.); (S.M.); (J.P.)
- Jürgen Popp
  - Institute of Physical Chemistry (IPC) and Abbe Center of Photonics (ACP), Friedrich Schiller University Jena, Member of the Leibniz Centre for Photonics in Infection Research (LPI), Helmholtzweg 4, 07743 Jena, Germany; (J.C.); (S.M.); (J.P.)
  - Leibniz Institute of Photonic Technology, Member of Leibniz Health Technologies, Member of the Leibniz Centre for Photonics in Infection Research (LPI), Albert Einstein Straße 9, 07745 Jena, Germany
- Thomas Bocklitz
  - Institute of Physical Chemistry (IPC) and Abbe Center of Photonics (ACP), Friedrich Schiller University Jena, Member of the Leibniz Centre for Photonics in Infection Research (LPI), Helmholtzweg 4, 07743 Jena, Germany; (J.C.); (S.M.); (J.P.)
  - Leibniz Institute of Photonic Technology, Member of Leibniz Health Technologies, Member of the Leibniz Centre for Photonics in Infection Research (LPI), Albert Einstein Straße 9, 07745 Jena, Germany
  - Institute of Computer Science, Faculty of Mathematics, Physics & Computer Science, University Bayreuth, Universitaetsstraße 30, 95447 Bayreuth, Germany
4
Fu X, Jiang J, Wu X, Huang L, Han R, Li K, Liu C, Roy K, Chen J, Mahmoud NTA, Wang Z. Deep learning in water protection of resources, environment, and ecology: achievement and challenges. Environmental Science and Pollution Research International 2024; 31:14503-14536. [PMID: 38305966 DOI: 10.1007/s11356-024-31963-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Accepted: 01/06/2024] [Indexed: 02/03/2024]
Abstract
Breathtaking economic development has taken a heavy toll on ecology, especially through water pollution. Efficient water resource management has a long-term influence on the sustainable development of the economy and society. Economic development and ecological preservation are intertwined, and the growth of one is not possible without the other. Deep learning (DL) is ubiquitous in autonomous driving, medical imaging, speech recognition, etc. The spectacular success of deep learning comes from its power to build richer representations of data. In view of the bright prospects of DL, this review comprehensively covers the development of DL applications in water resources management, water environment protection, and water ecology. First, the concept and modeling steps of DL are briefly introduced, including data preparation, algorithm selection, and model evaluation. Finally, the advantages and disadvantages of commonly used algorithms are analyzed according to their structures and mechanisms, and recommendations on the selection of DL algorithms for different studies, as well as prospects for the application and development of DL in water science, are proposed. This review provides references for solving a wider range of water-related problems and brings further insights into the intelligent development of water science.
Affiliation(s)
- Xiaohua Fu
  - Ecological Environment Management and Assessment Center, Central South University of Forestry and Technology, Changsha, 410004, People's Republic of China
- Jie Jiang
  - Ecological Environment Management and Assessment Center, Central South University of Forestry and Technology, Changsha, 410004, People's Republic of China
  - State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, Ministry of Ecology and Environment, South China Institute of Environmental Sciences, Guangzhou, 510655, People's Republic of China
- Xie Wu
  - China Railway Water Information Technology Co, LTD, Nanchang, 330000, People's Republic of China
- Lei Huang
  - School of Environmental Science and Engineering, Guangzhou University, Guangzhou, 510006, People's Republic of China
- Rui Han
  - China Environment Publishing Group, Beijing, 100062, People's Republic of China
- Kun Li
  - Freeman Business School, Tulane University, New Orleans, LA, 70118, USA
  - Guangzhou Huacai Environmental Protection Technology Co., Ltd, Guangzhou, 511480, People's Republic of China
- Chang Liu
  - State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, Ministry of Ecology and Environment, South China Institute of Environmental Sciences, Guangzhou, 510655, People's Republic of China
- Kallol Roy
  - Institute of Computer Science, University of Tartu, 51009, Tartu, Estonia
- Jianyu Chen
  - State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, Ministry of Ecology and Environment, South China Institute of Environmental Sciences, Guangzhou, 510655, People's Republic of China
- Zhenxing Wang
  - State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, Ministry of Ecology and Environment, South China Institute of Environmental Sciences, Guangzhou, 510655, People's Republic of China
5
Cai L, Han F, Ji B, He X, Wang L, Niu T, Zhai J, Wang J. In Silico Screening of Natural Flavonoids against 3-Chymotrypsin-like Protease of SARS-CoV-2 Using Machine Learning and Molecular Modeling. Molecules 2023; 28:8034. [PMID: 38138524 PMCID: PMC10745665 DOI: 10.3390/molecules28248034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 11/30/2023] [Accepted: 12/07/2023] [Indexed: 12/24/2023] Open
Abstract
The "Long-COVID syndrome" has posed significant challenges due to a lack of validated therapeutic options. We developed a novel multi-step virtual screening strategy to reliably identify inhibitors against 3-chymotrypsin-like protease of SARS-CoV-2 from abundant flavonoids, which represents a promising source of antiviral and immune-boosting nutrients. We identified 57 interacting residues as contributors to the protein-ligand binding pocket. Their energy interaction profiles constituted the input features for Machine Learning (ML) models. The consensus of 25 classifiers trained using various ML algorithms attained 93.9% accuracy and a 6.4% false-positive-rate. The consensus of 10 regression models for binding energy prediction also achieved a low root-mean-square error of 1.18 kcal/mol. We screened out 120 flavonoid hits first and retained 50 drug-like hits after predefined ADMET filtering to ensure bioavailability and safety profiles. Furthermore, molecular dynamics simulations prioritized nine bioactive flavonoids as promising anti-SARS-CoV-2 agents exhibiting both high structural stability (root-mean-square deviation < 5 Å for 218 ns) and low MM/PBSA binding free energy (<-6 kcal/mol). Among them, KB-2 (PubChem-CID, 14630497) and 9-O-Methylglyceofuran (PubChem-CID, 44257401) displayed excellent binding affinity and desirable pharmacokinetic capabilities. These compounds have great potential to serve as oral nutraceuticals with therapeutic and prophylactic properties as care strategies for patients with long-COVID syndrome.
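The consensus figures quoted in this abstract (accuracy, false-positive rate, and root-mean-square error) can be made concrete with a small sketch. The abstract does not say how the consensus was formed, so majority voting for the classifiers and simple averaging for the regressors are assumptions here, and all votes, labels, and energies below are invented toy values rather than data from the study.

```python
import math

def majority_vote(votes):
    """Return 1 if more than half of the binary votes are 1, else 0."""
    return 1 if sum(votes) * 2 > len(votes) else 0

def accuracy_and_fpr(y_true, y_pred):
    """Accuracy and false-positive rate for binary labels (1 = active)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    accuracy = (tp + tn) / len(pairs)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr

def rmse(y_true, y_pred):
    """Root-mean-square error between true and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Toy classification: rows are classifiers, columns are compounds.
votes_per_model = [[1, 0, 1, 0], [1, 1, 0, 0], [1, 0, 0, 1]]
consensus = [majority_vote(column) for column in zip(*votes_per_model)]
acc, fpr = accuracy_and_fpr([1, 0, 1, 0], consensus)

# Toy regression: average the per-model binding energies (kcal/mol).
energies_per_model = [[-7.0, -5.0], [-8.0, -6.0]]
mean_energy = [sum(column) / len(column) for column in zip(*energies_per_model)]
error = rmse([-7.5, -6.5], mean_energy)

print(consensus, acc, fpr, round(error, 4))
```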
Affiliation(s)
- Junmei Wang
  - School of Pharmacy, University of Pittsburgh, Pittsburgh, PA 15261, USA; (L.C.); (F.H.); (B.J.); (X.H.); (L.W.); (T.N.); (J.Z.)
6
Zhu Z, Yao Z, Zheng X, Qi G, Li Y, Mazur N, Gao X, Gong Y, Cong B. Drug-target affinity prediction method based on multi-scale information interaction and graph optimization. Comput Biol Med 2023; 167:107621. [PMID: 37907030 DOI: 10.1016/j.compbiomed.2023.107621] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 07/18/2023] [Revised: 10/16/2023] [Accepted: 10/23/2023] [Indexed: 11/02/2023]
Abstract
Drug-target affinity (DTA) prediction as an emerging and effective method is widely applied to explore the strength of drug-target interactions in drug development research. By predicting these interactions, researchers can assess the potential efficacy and safety of candidate drugs at an early stage, narrowing down the search space for therapeutic targets and accelerating the discovery and development of new drugs. However, existing DTA prediction models mainly use graphical representations of drug molecules, which lack information on interactions between individual substructures, thus affecting prediction accuracy and model interpretability. Therefore, transformer and diffusion on drug graphs in DTA prediction (TDGraphDTA) are introduced to predict drug-target interactions using multi-scale information interaction and graph optimization. An interactive module is integrated into feature extraction of drug and target features at different granularity levels. A diffusion model-based graph optimization module is proposed to improve the representation of molecular graph structures and enhance the interpretability of graph representations while obtaining optimal feature representations. In addition, TDGraphDTA improves the accuracy and reliability of predictions by capturing relationships and contextual information between molecular substructures. The performance of the proposed TDGraphDTA in DTA prediction was verified on three publicly available benchmark datasets (Davis, Metz, and KIBA). Compared with state-of-the-art baseline models, it achieved better results in terms of consistency index, R-squared, etc. Furthermore, compared with some existing methods, the proposed TDGraphDTA is demonstrated to have better structure capturing capabilities by visualizing the feature capturing capabilities of the model using Grad-AAM toxicity labels in the ToxCast dataset. The corresponding source codes are available at https://github.com/Lamouryz/TDGraph.
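The evaluation metrics named in this abstract can be sketched briefly. The "consistency index" is assumed here to mean the concordance index (CI) commonly reported in DTA studies, and "R-squared" is assumed to be the ordinary coefficient of determination; the abstract does not define either, and the affinity values below are invented for illustration.

```python
def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs ranked in the correct order.

    Pairs with equal true affinity are skipped; tied predictions count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue
            comparable += 1
            true_diff = y_true[i] - y_true[j]
            pred_diff = y_pred[i] - y_pred[j]
            if pred_diff == 0:
                concordant += 0.5
            elif (true_diff > 0) == (pred_diff > 0):
                concordant += 1.0
    return concordant / comparable if comparable else float("nan")

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy affinities (for example pKd values); not taken from any benchmark.
measured = [5.0, 6.0, 7.0, 8.0]
predicted_affinity = [5.5, 5.0, 7.0, 9.0]

ci = concordance_index(measured, predicted_affinity)
r2 = r_squared(measured, predicted_affinity)
print(round(ci, 4), round(r2, 4))
```

CI depends only on the ordering of predictions, while R-squared also penalizes errors in magnitude, which is one reason the two are often reported together.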
Affiliation(s)
- Zhiqin Zhu
  - College of Automation, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Zheng Yao
  - College of Automation, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Xin Zheng
  - College of Automation, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Guanqiu Qi
  - Computer Information Systems Department, State University of New York at Buffalo State, Buffalo, NY 14222, USA
- Yuanyuan Li
  - College of Automation, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Neal Mazur
  - Computer Information Systems Department, State University of New York at Buffalo State, Buffalo, NY 14222, USA
- Xinbo Gao
  - College of Automation, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Yifei Gong
  - Faculty of applied science & engineering, the Edward S. Rogers Sr. Department of Electrical & Computer Engineering (ECE), University of Toronto at Toronto, ON M5S, Canada
- Baisen Cong
  - Diagnostics Digital, DH(Shanghai) Diagnostics Co, Ltd, a Danaher company, Shanghai, 200335, China
7
Lederer J, Gastegger M, Schütt KT, Kampffmeyer M, Müller KR, Unke OT. Automatic identification of chemical moieties. Phys Chem Chem Phys 2023; 25:26370-26379. [PMID: 37750554 PMCID: PMC10548786 DOI: 10.1039/d3cp03845a] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/11/2023] [Accepted: 08/18/2023] [Indexed: 09/27/2023]
Abstract
In recent years, the prediction of quantum mechanical observables with machine learning methods has become increasingly popular. Message-passing neural networks (MPNNs) solve this task by constructing atomic representations, from which the properties of interest are predicted. Here, we introduce a method to automatically identify chemical moieties (molecular building blocks) from such representations, enabling a variety of applications beyond property prediction, which otherwise rely on expert knowledge. The required representation can either be provided by a pretrained MPNN, or be learned from scratch using only structural information. Beyond the data-driven design of molecular fingerprints, the versatility of our approach is demonstrated by enabling the selection of representative entries in chemical databases, the automatic construction of coarse-grained force fields, as well as the identification of reaction coordinates.
Affiliation(s)
- Jonas Lederer
  - Berlin Institute of Technology (TU Berlin), 10587 Berlin, Germany
  - BIFOLD - Berlin Institute for the Foundations of Learning and Data, Germany
- Michael Gastegger
  - Berlin Institute of Technology (TU Berlin), 10587 Berlin, Germany
  - BIFOLD - Berlin Institute for the Foundations of Learning and Data, Germany
- Kristof T Schütt
  - Berlin Institute of Technology (TU Berlin), 10587 Berlin, Germany
  - BIFOLD - Berlin Institute for the Foundations of Learning and Data, Germany
- Michael Kampffmeyer
  - Department of Physics and Technology, UiT The Arctic University of Norway, 9019 Tromsø, Norway
- Klaus-Robert Müller
  - Berlin Institute of Technology (TU Berlin), 10587 Berlin, Germany
  - BIFOLD - Berlin Institute for the Foundations of Learning and Data, Germany
  - Google Deepmind, Germany
  - Department of Artificial Intelligence, Korea University, Seoul 136-713, Korea
  - Max Planck Institut für Informatik, 66123 Saarbrücken, Germany
- Oliver T Unke
  - Berlin Institute of Technology (TU Berlin), 10587 Berlin, Germany
  - BIFOLD - Berlin Institute for the Foundations of Learning and Data, Germany
  - Google Deepmind, Germany
8
Dou B, Zhu Z, Merkurjev E, Ke L, Chen L, Jiang J, Zhu Y, Liu J, Zhang B, Wei GW. Machine Learning Methods for Small Data Challenges in Molecular Science. Chem Rev 2023; 123:8736-8780. [PMID: 37384816 PMCID: PMC10999174 DOI: 10.1021/acs.chemrev.3c00189] [Citation(s) in RCA: 79] [Impact Index Per Article: 39.5] [Indexed: 07/01/2023]
Abstract
Small data are often used in scientific and engineering research due to the presence of various constraints, such as time, cost, ethics, privacy, security, and technical limitations in data acquisition. However, because big data have been the focus for the past decade, small data and their challenges have received little attention, even though they are technically more severe in machine learning (ML) and deep learning (DL) studies. Overall, the small data challenge is often compounded by issues such as data diversity, imputation, noise, imbalance, and high dimensionality. Fortunately, the current big data era is characterized by technological breakthroughs in ML, DL, and artificial intelligence (AI), which enable data-driven scientific discovery, and many advanced ML and DL technologies developed for big data have inadvertently provided solutions for small data problems. As a result, significant progress has been made in ML and DL for small data challenges in the past decade. In this review, we summarize and analyze several emerging potential solutions to small data challenges in molecular science, including chemical and biological sciences. We review both basic machine learning algorithms, such as linear regression, logistic regression (LR), k-nearest neighbor (KNN), support vector machine (SVM), kernel learning (KL), random forest (RF), and gradient boosting trees (GBT), and more advanced techniques, including artificial neural network (ANN), convolutional neural network (CNN), U-Net, graph neural network (GNN), generative adversarial network (GAN), long short-term memory (LSTM), autoencoder, transformer, transfer learning, active learning, graph-based semi-supervised learning, combining deep learning with traditional machine learning, and physical model-based data augmentation. We also briefly discuss the latest advances in these methods. Finally, we conclude the survey with a discussion of promising trends in small data challenges in molecular science.
Affiliation(s)
- Bozheng Dou
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Zailiang Zhu
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Ekaterina Merkurjev
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Lu Ke
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Long Chen
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Jian Jiang
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Yueying Zhu
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Jie Liu
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Bengong Zhang
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
9
Bich VNT, Nguyen TK, Thu TDT, Tran LTT, Nguyen SVD, Han HL, Pham LHD, Thanh TH, Duong VH, Trieu TA, Tran MH, Pham PTV. Investigating the antibacterial mechanism of Ampelopsis cantoniensis extracts against methicillin-resistant Staphylococcus aureus via in vitro and in silico analysis. J Biomol Struct Dyn 2023; 41:14080-14091. [PMID: 36889929 DOI: 10.1080/07391102.2023.2187218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Accepted: 01/31/2023] [Indexed: 03/10/2023]
Abstract
Methicillin-resistant Staphylococcus aureus (MRSA) is a critical pathogen responsible for a wide variety of serious infectious diseases in humans. The accelerating phenomena of drug tolerance, drug resistance, and dysbacteriosis provoked by antibiotic misuse are impeding the effectiveness of the contemporary antibiotic therapies primarily used to treat this common worldwide pathogen. In this study, the antibacterial activity of the 70% ethanol extract and multiple polar solvent fractions of Ampelopsis cantoniensis was measured against a clinical MRSA isolate. The agar diffusion technique was employed to determine the zone of inhibition (ZOI), accompanied by the use of a microdilution series to identify the minimal inhibitory concentration (MIC) and minimal bactericidal concentration (MBC). Our results revealed that the ethyl acetate fraction exhibited the most significant antibacterial activity, which was determined to be bacteriostatic based on the MBC/MIC ratio 8. A list of compounds isolated from A. cantoniensis was computationally studied to further investigate the mechanism of action with the bacterial membrane protein PBP2a. The combination of molecular docking and molecular dynamics methods showed that the main compound, dihydromyricetin (DHM), is expected to bind to PBP2a at the allosteric site. In addition, DHM was identified as the major compound of the ethyl acetate fraction, accounting for 77.03 ± 2.44% as determined by high performance liquid chromatography (HPLC) analysis. As a concluding remark, our study addressed the antibacterial mechanism and suggested the prioritization of natural products derived from A. cantoniensis as a potential therapy for MRSA. Communicated by Ramaswamy H. Sarma.
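The bacteriostatic call from the MBC/MIC ratio in this abstract can be expressed as a small rule. The cutoff of 4 used below is a widely used convention (a ratio of at most 4 read as bactericidal, a larger ratio as bacteriostatic) and is an assumption here, since the abstract does not state which threshold the authors applied. The concentrations are hypothetical values chosen only to reproduce a ratio of 8.

```python
def classify_activity(mic, mbc, threshold=4.0):
    """Label an agent from its MBC/MIC ratio.

    Assumed convention: ratio <= threshold is bactericidal, otherwise bacteriostatic.
    """
    if mic <= 0 or mbc <= 0:
        raise ValueError("MIC and MBC must be positive concentrations")
    ratio = mbc / mic
    label = "bactericidal" if ratio <= threshold else "bacteriostatic"
    return ratio, label

# Hypothetical concentrations in the same unit (for example mg/mL).
ratio, label = classify_activity(mic=0.5, mbc=4.0)
print(ratio, label)
```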
Affiliation(s)
- Van Ngo Thai Bich
  - Faculty of Chemical Engineering, The University of Danang, University of Science and Technology, Da Nang, Vietnam
- Tan Khanh Nguyen
  - Scientific Management Department, Dong A University, Da Nang, Vietnam
- Thao Dao Thi Thu
  - Faculty of Chemical Engineering, The University of Danang, University of Science and Technology, Da Nang, Vietnam
- Linh Thuy Thi Tran
  - Faculty of Pharmacy, Hue University of Medicine and Pharmacy, Hue University, Hue, Vietnam
- Ho Le Han
  - Scientific Management Department, Dong A University, Da Nang, Vietnam
- Trung Hoang Thanh
  - Faculty of Chemical Engineering, The University of Danang, University of Science and Technology, Da Nang, Vietnam
  - Family Hospital, Da Nang, Vietnam
- Van Hoa Duong
  - Danang Department of Science and Technology, People Committee of Danang, Danang, Vietnam
- Manh Hung Tran
  - School of Medicine and Pharmacy, The University of Danang, Danang, Vietnam
10
Su A, Zhang X, Zhang C, Ding D, Yang YF, Wang K, She YB. Deep transfer learning for predicting frontier orbital energies of organic materials using small data and its application to porphyrin photocatalysts. Phys Chem Chem Phys 2023; 25:10536-10549. [PMID: 36987933 DOI: 10.1039/d3cp00917c] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 03/30/2023]
Abstract
A deep transfer learning approach is used to predict HOMO/LUMO energies of organic materials with a small amount of training data.
Affiliation(s)
- An Su
  - College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310014, P. R. China
- Xin Zhang
  - College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310014, P. R. China
- Chengwei Zhang
  - College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310014, P. R. China
- Debo Ding
  - College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310014, P. R. China
- Yun-Fang Yang
  - College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310014, P. R. China
- Keke Wang
  - College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310014, P. R. China
- Yuan-Bin She
  - College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310014, P. R. China
11
Liao J, Chen H, Wei L, Wei L. GSAML-DTA: An interpretable drug-target binding affinity prediction model based on graph neural networks with self-attention mechanism and mutual information. Comput Biol Med 2022; 150:106145. [PMID: 37859276 DOI: 10.1016/j.compbiomed.2022.106145] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 08/10/2022] [Revised: 08/23/2022] [Accepted: 09/24/2022] [Indexed: 11/03/2022]
Abstract
Identifying drug-target affinity (DTA) has great practical importance in the process of designing efficacious drugs for known diseases. Recently, numerous deep learning-based computational methods have been developed to predict drug-target affinity and have achieved impressive performance. However, most of them construct the molecule (drug or target) encoder without considering the weights of the features of each node (atom or residue). Besides, they generally combine drug and target representations directly, which may retain task-irrelevant information. In this study, we develop GSAML-DTA, an interpretable deep learning framework for DTA prediction. GSAML-DTA integrates a self-attention mechanism and graph neural networks (GNNs) to build representations of drugs and target proteins from the structural information. In addition, mutual information is introduced to filter out redundant information and retain relevant information in the combined representations of drugs and targets. Extensive experimental results demonstrate that GSAML-DTA outperforms state-of-the-art methods for DTA prediction on two benchmark datasets. Furthermore, GSAML-DTA has the interpretive ability to analyze binding atoms and residues, which may be conducive to data-driven chemical biology studies. Overall, GSAML-DTA can serve as a powerful and interpretable tool suitable for DTA modelling.
Affiliation(s)
- Jiaqi Liao
- School of Software, Shandong University, Jinan, China
- Haoyang Chen
- School of Software, Shandong University, Jinan, China
- Lesong Wei
- Department of Computer Science, University of Tsukuba, Tsukuba, 3058577, Japan
- Leyi Wei
- School of Software, Shandong University, Jinan, China
|
12
|
Lou C, Yang H, Wang J, Huang M, Li W, Liu G, Lee PW, Tang Y. IDL-PPBopt: A Strategy for Prediction and Optimization of Human Plasma Protein Binding of Compounds via an Interpretable Deep Learning Method. J Chem Inf Model 2022; 62:2788-2799. [PMID: 35607907 DOI: 10.1021/acs.jcim.2c00297] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 01/29/2023]
Abstract
The prediction and optimization of pharmacokinetic properties are essential in lead optimization. Traditional strategies mainly depend on the empirical chemical rules from medicinal chemists. However, with the rising amount of data, it is getting more difficult to manually extract useful medicinal chemistry knowledge. To this end, we introduced IDL-PPBopt, a computational strategy for predicting and optimizing the plasma protein binding (PPB) property based on an interpretable deep learning method. At first, a curated PPB data set was used to construct an interpretable deep learning model, which showed excellent predictive performance with a root mean squared error of 0.112 for the entire test set. Then, we designed a detection protocol based on the model and Wilcoxon test to identify the PPB-related substructures (named privileged substructures, PSubs) for each molecule. In total, 22 general privileged substructures (GPSubs) were identified, which shared some common features such as nitrogen-containing groups, diamines with two carbon units, and azetidine. Furthermore, a series of second-level chemical rules for each GPSub were derived through a statistical test and then summarized into substructure pairs. We demonstrated that these substructure pairs were equally applicable outside the training set and accordingly customized the structural modification schemes for each GPSub, which provided alternatives for the optimization of the PPB property. Therefore, IDL-PPBopt provides a promising scheme for the prediction and optimization of the PPB property and would be helpful for lead optimization of other pharmacokinetic properties.
Affiliation(s)
- Chaofeng Lou
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Hongbin Yang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Jiye Wang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Mengting Huang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Weihua Li
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Guixia Liu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Philip W Lee
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
- Yun Tang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
|
13
|
Yang Z, Zhong W, Zhao L, Yu-Chian Chen C. MGraphDTA: deep multiscale graph neural network for explainable drug-target binding affinity prediction. Chem Sci 2022; 13:816-833. [PMID: 35173947 PMCID: PMC8768884 DOI: 10.1039/d1sc05180f] [Citation(s) in RCA: 128] [Impact Index Per Article: 42.7] [Received: 09/18/2021] [Accepted: 12/17/2021] [Indexed: 12/22/2022] Open
Abstract
Predicting drug-target affinity (DTA) is beneficial for accelerating drug discovery. Graph neural networks (GNNs) have been widely used in DTA prediction. However, existing shallow GNNs are insufficient to capture the global structure of compounds. Besides, the interpretability of graph-based DTA models relies heavily on the graph attention mechanism, which cannot reveal the global relationship between each atom of a molecule. In this study, we proposed a deep multiscale graph neural network based on chemical intuition for DTA prediction (MGraphDTA). We introduced a dense connection into the GNN and built a super-deep GNN with 27 graph convolutional layers to capture the local and global structure of the compound simultaneously. We also developed a novel visual explanation method, gradient-weighted affinity activation mapping (Grad-AAM), to analyze a deep learning model from the chemical perspective. We evaluated our approach using seven benchmark datasets and compared the proposed method to the state-of-the-art deep learning (DL) models. MGraphDTA outperforms other DL-based approaches significantly on various datasets. Moreover, we show that Grad-AAM creates explanations that are consistent with pharmacologists, which may help us gain chemical insights directly from data beyond human perception. These advantages demonstrate that the proposed method improves the generalization and interpretation capability of DTA prediction modeling.
Affiliation(s)
- Ziduo Yang
- Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University Shenzhen 510275 China +862039332153
- Weihe Zhong
- Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University Shenzhen 510275 China +862039332153
- Lu Zhao
- Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University Shenzhen 510275 China +862039332153
- Department of Clinical Laboratory, The Sixth Affiliated Hospital, Sun Yat-sen University Guangzhou 510655 China
- Calvin Yu-Chian Chen
- Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University Shenzhen 510275 China +862039332153
- Department of Medical Research, China Medical University Hospital Taichung 40447 Taiwan
- Department of Bioinformatics and Medical Engineering, Asia University Taichung 41354 Taiwan
|
14
|
Tynes M, Gao W, Burrill DJ, Batista ER, Perez D, Yang P, Lubbers N. Pairwise Difference Regression: A Machine Learning Meta-algorithm for Improved Prediction and Uncertainty Quantification in Chemical Search. J Chem Inf Model 2021; 61:3846-3857. [PMID: 34347460 DOI: 10.1021/acs.jcim.1c00670] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 11/28/2022]
Abstract
Machine learning (ML) plays a growing role in the design and discovery of chemicals, aiming to reduce the need to perform expensive experiments and simulations. ML for such applications is promising but difficult, as models must generalize to vast chemical spaces from small training sets and must have reliable uncertainty quantification metrics to identify and prioritize unexplored regions. Ab initio computational chemistry and chemical intuition alike often take advantage of differences between chemical conditions, rather than their absolute structure or state, to generate more reliable results. We have developed an analogous comparison-based approach for ML regression, called pairwise difference regression (PADRE), which is applicable to arbitrary underlying learning models and operates on pairs of input data points. During training, the model learns to predict differences between all possible pairs of input points. During prediction, the test points are paired with all training set points, giving rise to a set of predictions that can be treated as a distribution of which the mean is treated as a final prediction and the dispersion is treated as an uncertainty measure. Pairwise difference regression was shown to reliably improve the performance of the random forest algorithm across five chemical ML tasks. Additionally, the pair-derived dispersion is both well correlated with model error and performs well in active learning. We also show that this method is competitive with state-of-the-art neural network techniques. Thus, pairwise difference regression is a promising tool for candidate selection algorithms used in chemical discovery.
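The train-on-differences, predict-against-every-training-point procedure described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, the use of ordered pairs including self-pairs, the concatenated pair features, and the tiny k-nearest-neighbour base learner (standing in for the random forest mentioned in the abstract) are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def make_pairs(X, y):
    """Build every ordered pair (i, j); features are x_i + x_j, target is y_i - y_j.

    Using ordered pairs with self-pairs and concatenated features is an assumption.
    """
    feats, targets = [], []
    for xi, yi in zip(X, y):
        for xj, yj in zip(X, y):
            feats.append(tuple(xi) + tuple(xj))
            targets.append(yi - yj)
    return feats, targets


class KNNRegressor:
    """Tiny k-nearest-neighbour regressor; a hypothetical stand-in for any base learner."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict_one(self, x):
        # Rank training rows by Euclidean distance and average the k closest targets.
        order = sorted(range(len(self.X)), key=lambda i: math.dist(self.X[i], x))
        return mean(self.y[i] for i in order[: self.k])


class PairwiseDifferenceRegressor:
    """Wraps a base learner so that it learns differences between pairs of points."""

    def __init__(self, base):
        self.base = base

    def fit(self, X, y):
        self.X, self.y = [tuple(x) for x in X], list(y)
        feats, targets = make_pairs(self.X, self.y)
        self.base.fit(feats, targets)
        return self

    def predict_one(self, x):
        x = tuple(x)
        # Pair the query with every training point: predicted difference plus the
        # known training label gives one estimate of the query's value.
        estimates = [
            self.base.predict_one(x + xj) + yj for xj, yj in zip(self.X, self.y)
        ]
        # Mean of the estimates is the prediction; their spread is the uncertainty.
        return mean(estimates), pstdev(estimates)


if __name__ == "__main__":
    X = [(0.0,), (1.0,), (2.0,), (3.0,)]
    y = [0.0, 2.0, 4.0, 6.0]
    model = PairwiseDifferenceRegressor(KNNRegressor(k=1)).fit(X, y)
    print(model.predict_one((1.0,)))
```

Note that the number of training pairs grows quadratically with the training set size, so a sketch like this is only practical for small data sets unless the pairs are subsampled.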
Affiliation(s)
- Michael Tynes
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Wenhao Gao
- Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Daniel J Burrill
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Enrique R Batista
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Danny Perez
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Ping Yang
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Nicholas Lubbers
- Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
|