1
Lin B, Yan S, Zhen B. A machine learning method for predicting molecular antimicrobial activity. Sci Rep 2025; 15:6559. [PMID: 39994442 PMCID: PMC11850884 DOI: 10.1038/s41598-025-91190-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2024] [Accepted: 02/18/2025] [Indexed: 02/26/2025] Open
Abstract
In response to the increasing concern over antibiotic resistance and the limitations of traditional methods in antibiotic discovery, we introduce a machine learning-based method named MFAGCN. This method predicts the antimicrobial efficacy of molecules by integrating three types of molecular fingerprints (MACCS, PubChem, and ECFP) along with molecular graph representations as input features, with a specific focus on molecular functional groups. MFAGCN incorporates an attention mechanism that assigns different weights to the information coming from different neighboring nodes. Comparative experiments with baseline models on two public datasets demonstrate MFAGCN's superior performance. Additionally, we analyzed the functional group distribution in both the training and test sets to validate the model's predictions. Furthermore, we performed structural similarity analyses against known antibiotics to avoid rediscovering established antibiotics. This approach enables researchers to rapidly screen molecules with potent antimicrobial properties and facilitates the identification of functional groups that influence antimicrobial performance, providing valuable insights for further antibiotic development.
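The attention step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a simple dot-product score between a node and each neighbor followed by a softmax; the abstract does not specify the scoring function MFAGCN uses, so `attention_aggregate`, the feature vectors, and the scoring rule are all hypothetical and only show the general idea of weighting neighbors unequally.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_aggregate(node_feat, neighbor_feats):
    """Weight each neighbor by its similarity to the node, then sum.

    Assumption: dot-product scoring, chosen for illustration only.
    """
    scores = [sum(a * b for a, b in zip(node_feat, nf)) for nf in neighbor_feats]
    weights = softmax(scores)
    dim = len(node_feat)
    aggregated = [
        sum(w * nf[i] for w, nf in zip(weights, neighbor_feats))
        for i in range(dim)
    ]
    return weights, aggregated


# Hypothetical 2-dimensional atom features for one node and three neighbors.
center = [1.0, 0.0]
neighbors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights, aggregated = attention_aggregate(center, neighbors)
print(weights, aggregated)
```

In this toy case the neighbor most similar to the central node receives the largest weight, so it contributes most to the aggregated feature vector.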
Affiliation(s)
- Bangjiang Lin
  - Quanzhou Institute of Equipment Manufacturing, Haixi Institutes, Chinese Academy of Sciences, Quanzhou, 362216, China
  - College of Electrical Engineering and Automation, Fuzhou University, Fuzhou, 350108, China
- Shujie Yan
  - Quanzhou Institute of Equipment Manufacturing, Haixi Institutes, Chinese Academy of Sciences, Quanzhou, 362216, China
  - College of Electrical Engineering and Automation, Fuzhou University, Fuzhou, 350108, China
- Bowen Zhen
  - Quanzhou Institute of Equipment Manufacturing, Haixi Institutes, Chinese Academy of Sciences, Quanzhou, 362216, China
  - College of Electrical Engineering and Automation, Fuzhou University, Fuzhou, 350108, China
2
Kengkanna A, Ohue M. Enhancing property and activity prediction and interpretation using multiple molecular graph representations with MMGX. Commun Chem 2024; 7:74. [PMID: 38580841 PMCID: PMC10997661 DOI: 10.1038/s42004-024-01155-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Accepted: 03/18/2024] [Indexed: 04/07/2024] Open
Abstract
Graph Neural Networks (GNNs) excel in compound property and activity prediction, but the choice of molecular graph representation significantly influences model learning and interpretation. While atom-level molecular graphs resemble natural topology, they overlook key substructures or functional groups, and their interpretation only partially aligns with chemical intuition. Recent research suggests alternative representations that use reduced molecular graphs to integrate higher-level chemical information and leverage both representations for modeling. However, there is a lack of studies on the applicability and impact of different molecular graphs on model learning and interpretation. Here, we introduce MMGX (Multiple Molecular Graph eXplainable discovery), investigating the effects of multiple molecular graphs, including Atom, Pharmacophore, JunctionTree, and FunctionalGroup, on model learning and interpretation from various perspectives. Our findings indicate that multiple graphs tend to improve model performance, though to varying degrees depending on the dataset. Interpretation from multiple graphs in different views provides more comprehensive features and potential substructures consistent with background knowledge. These results help to understand model decisions and offer valuable insights for subsequent tasks. The concept of multiple molecular graph representations and diverse interpretation perspectives has broad applicability across tasks, architectures, and explanation techniques, enhancing model learning and interpretation for relevant applications in drug discovery.
Affiliation(s)
- Apakorn Kengkanna
  - Department of Computer Science, School of Computing, Tokyo Institute of Technology, Kanagawa, 226-8501, Japan
- Masahito Ohue
  - Department of Computer Science, School of Computing, Tokyo Institute of Technology, Kanagawa, 226-8501, Japan
3
He B, Guo J, Tong HHY, To WM. Artificial Intelligence in Drug Discovery: A Bibliometric Analysis and Literature Review. Mini Rev Med Chem 2024; 24:1353-1367. [PMID: 38243944 DOI: 10.2174/0113895575271267231123160503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2023] [Revised: 09/09/2023] [Accepted: 09/11/2023] [Indexed: 01/22/2024]
Abstract
Drug discovery is a complex and iterative process, making it ideal for using artificial intelligence (AI). This paper uses a bibliometric approach to reveal AI's trend and underlying structure in drug discovery (AIDD). A total of 4310 journal articles and reviews indexed in Scopus were analyzed, revealing that AIDD has been rapidly growing over the past two decades, with a significant increase after 2017. The United States, China, and the United Kingdom were the leading countries in research output, with academic institutions, particularly the Chinese Academy of Sciences and the University of Cambridge, being the most productive. In addition, industrial companies, including both pharmaceutical and high-tech ones, also made significant contributions. Additionally, this paper thoroughly discussed the evolution and research frontiers of AIDD, which were uncovered through co-occurrence analyses of keywords using VOSviewer. Our findings highlight that AIDD is an interdisciplinary and promising research field that has the potential to revolutionize drug discovery. The comprehensive overview provided here will be of significant interest to researchers, practitioners, and policy-makers in related fields. The results emphasize the need for continued investment and collaboration in AIDD to accelerate drug discovery, reduce costs, and improve patient outcomes.
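The keyword co-occurrence analysis mentioned in this abstract can be sketched in a few lines. The abstract states that VOSviewer was used; the snippet below is not that tool, only a rough illustration of the underlying counting step, and the `records` list is hypothetical sample data rather than anything from the study.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(records):
    """Count how often each pair of keywords appears in the same record."""
    counts = Counter()
    for keywords in records:
        # Normalize case and drop duplicates so each pair counts once per record.
        unique = sorted({k.lower() for k in keywords})
        counts.update(combinations(unique, 2))
    return counts


# Hypothetical author keywords for three articles.
records = [
    ["machine learning", "drug discovery"],
    ["drug discovery", "deep learning", "machine learning"],
    ["deep learning", "Drug Discovery"],
]
pairs = keyword_cooccurrence(records)
for pair, n in pairs.most_common():
    print(pair, n)
```

Pairs with high counts correspond to the strongly linked keywords that a co-occurrence map would draw close together.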
Affiliation(s)
- Baoyu He
  - Centre for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macao, China
- Jingjing Guo
  - Centre for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macao, China
- Henry H Y Tong
  - Centre for Artificial Intelligence Driven Drug Discovery, Faculty of Applied Sciences, Macao Polytechnic University, Macao, China
- Wai Ming To
  - Faculty of Business, Macao Polytechnic University, Macao, China
4
Zhou H, Fu H, Shao X, Cai W. Binding Thermodynamics of Fourth-Generation EGFR Inhibitors Revealed by Absolute Binding Free Energy Calculations. J Chem Inf Model 2023; 63:7837-7846. [PMID: 38054791 DOI: 10.1021/acs.jcim.3c01636] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/07/2023]
Abstract
The overexpression or mutation of the kinase domain of the epidermal growth factor receptor (EGFR) is strongly associated with non-small-cell lung cancer (NSCLC). EGFR tyrosine kinase inhibitors (TKIs) have proven to be effective in treating NSCLC patients. However, EGFR mutations can result in drug resistance. To elucidate the mechanisms underlying this resistance and inform future drug development, we examined the binding affinities of BLU-945, a recently reported fourth-generation TKI, to wild-type EGFR (EGFRWT) and its double-mutant (L858R/T790M; EGFRDM) and triple-mutant (L858R/T790M/C797S; EGFRTM) forms. We compared the binding affinities of BLU-945, BLU-945 analogues, CH7233163 (another fourth-generation TKI), and erlotinib (a first-generation TKI) using absolute binding free energy calculations. Our findings reveal that BLU-945 and CH7233163 exhibit binding affinities to both EGFRDM and EGFRTM stronger than those of erlotinib, corroborating experimental data. We identified K745 and T854 as the key residues in the binding of fourth-generation EGFR TKIs. Electrostatic forces were the predominant driving force for the binding of fourth-generation TKIs to EGFR mutants. Furthermore, we discovered that the incorporation of piperidinol and sulfone groups in BLU-945 substantially enhanced its binding capacity to EGFR mutants. Our study offers valuable theoretical insights for optimizing fourth-generation EGFR TKIs.
Affiliation(s)
- Huaxin Zhou
  - Research Center for Analytical Sciences, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, College of Chemistry, Nankai University, Tianjin 300071, China
  - Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China
- Haohao Fu
  - Research Center for Analytical Sciences, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, College of Chemistry, Nankai University, Tianjin 300071, China
  - Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China
  - School of Materials Science and Engineering, Smart Sensing Interdisciplinary Science Center, Nankai University, Tianjin 300350, China
- Xueguang Shao
  - Research Center for Analytical Sciences, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, College of Chemistry, Nankai University, Tianjin 300071, China
  - Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China
  - School of Materials Science and Engineering, Smart Sensing Interdisciplinary Science Center, Nankai University, Tianjin 300350, China
- Wensheng Cai
  - Research Center for Analytical Sciences, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, College of Chemistry, Nankai University, Tianjin 300071, China
  - Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China
  - School of Materials Science and Engineering, Smart Sensing Interdisciplinary Science Center, Nankai University, Tianjin 300350, China
5
Kou X, Shi P, Gao C, Ma P, Xing H, Ke Q, Zhang D. Data-Driven Elucidation of Flavor Chemistry. J Agric Food Chem 2023; 71:6789-6802. [PMID: 37102791 PMCID: PMC10176570 DOI: 10.1021/acs.jafc.3c00909] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 05/11/2023]
Abstract
Flavor molecules are commonly used in the food industry to enhance product quality and consumer experiences but are associated with potential human health risks, highlighting the need for safer alternatives. To address these health-associated challenges and promote reasonable application, several databases for flavor molecules have been constructed. However, no existing studies have comprehensively summarized these data resources according to quality, focused fields, and potential gaps. Here, we systematically summarized 25 flavor molecule databases published within the last 20 years and revealed that data inaccessibility, untimely updates, and nonstandard flavor descriptions are the main limitations of current studies. We examined the development of computational approaches (e.g., machine learning and molecular simulation) for the identification of novel flavor molecules and discussed their major challenges regarding throughput, model interpretability, and the lack of gold-standard data sets for equitable model evaluation. Additionally, we discussed future strategies for the mining and designing of novel flavor molecules based on multi-omics and artificial intelligence to provide a new foundation for flavor science research.
Affiliation(s)
- Xingran Kou
  - Collaborative Innovation Center of Fragrance Flavour and Cosmetics, School of Perfume and Aroma Technology, Shanghai Institute of Technology, Shanghai 201418, China
- Peiqin Shi
  - Collaborative Innovation Center of Fragrance Flavour and Cosmetics, School of Perfume and Aroma Technology, Shanghai Institute of Technology, Shanghai 201418, China
- Chukun Gao
  - Laboratory for Physical Chemistry, ETH Zürich, 8093 Zürich, Switzerland
- Peihua Ma
  - Department of Nutrition and Food Science, University of Maryland, College Park, Maryland 20742, United States
- Huadong Xing
  - CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Qinfei Ke
  - Collaborative Innovation Center of Fragrance Flavour and Cosmetics, School of Perfume and Aroma Technology, Shanghai Institute of Technology, Shanghai 201418, China
- Dachuan Zhang
  - National Centre of Competence in Research (NCCR) Catalysis, Institute of Environmental Engineering, ETH Zürich, 8093 Zürich, Switzerland