Reference Citation Analysis

For: Ralaivola L, Swamidass SJ, Saigo H, Baldi P. Graph kernels for chemical informatics. Neural Netw 2005;18:1093-110. [PMID: 16157471 DOI: 10.1016/j.neunet.2005.07.009] [Citation(s) in RCA: 188] [Impact Index Per Article: 9.4] [Indexed: 01/22/2023]

Cited by Other Article(s)

Baidya AT, Dante D, Das B, Wang L, Darreh-Shori T, Kumar R. Discovery and characterization of novel pyridone and furan substituted ligands of choline acetyltransferase. Eur J Pharmacol 2025;998:177638. [PMID: 40252901 DOI: 10.1016/j.ejphar.2025.177638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2025] [Revised: 04/16/2025] [Accepted: 04/16/2025] [Indexed: 04/21/2025]

Lamens A, Bajorath J. Comparing Explanations of Molecular Machine Learning Models Generated with Different Methods for the Calculation of Shapley Values. Mol Inform 2025;44:e202500067. [PMID: 40112199 PMCID: PMC11925390 DOI: 10.1002/minf.202500067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2025] [Revised: 03/04/2025] [Accepted: 03/06/2025] [Indexed: 03/22/2025]

Xerxa E, Vogt M, Bajorath J. Influence of Data Curation and Confidence Levels on Compound Predictions Using Machine Learning Models. J Chem Inf Model 2024;64:9341-9349. [PMID: 39656869 DOI: 10.1021/acs.jcim.4c01573] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/17/2024]

Jia L, Brémond É, Zaida L, Gaüzère B, Tognetti V, Joubert L. Predicting redox potentials by graph-based machine learning methods. J Comput Chem 2024;45:2383-2396. [PMID: 38923574 DOI: 10.1002/jcc.27380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 03/25/2024] [Accepted: 04/19/2024] [Indexed: 06/28/2024]

Yao Y, Oberhofer H. Designing building blocks of covalent organic frameworks through on-the-fly batch-based Bayesian optimization. J Chem Phys 2024;161:074102. [PMID: 39145552 DOI: 10.1063/5.0223540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2024] [Accepted: 07/30/2024] [Indexed: 08/16/2024] Open

Venkatraman V, Gaiser J, Demekas D, Roy A, Xiong R, Wheeler TJ. Do Molecular Fingerprints Identify Diverse Active Drugs in Large-Scale Virtual Screening? (No). Pharmaceuticals (Basel) 2024;17:992. [PMID: 39204097 PMCID: PMC11356940 DOI: 10.3390/ph17080992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2024] [Revised: 07/18/2024] [Accepted: 07/23/2024] [Indexed: 09/03/2024] Open

Maeda I, Tamura S, Ogura Y, Serizawa T, Shimada T, Kunimoto R, Miyao T. Scaffold-Hopped Compound Identification by Ligand-Based Approaches with a Prospective Affinity Test. J Chem Inf Model 2024;64:5557-5569. [PMID: 38950192 PMCID: PMC11267578 DOI: 10.1021/acs.jcim.4c00342] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Revised: 06/05/2024] [Accepted: 06/18/2024] [Indexed: 07/03/2024]

Li J, Wang J, Wu L, Wang X, Luo X, Xu Y. AMHGCN: Adaptive multi-level hypergraph convolution network for human motion prediction. Neural Netw 2024;172:106153. [PMID: 38306784 DOI: 10.1016/j.neunet.2024.106153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Revised: 11/20/2023] [Accepted: 01/28/2024] [Indexed: 02/04/2024]

Abstract

Human motion prediction is the key technology for many real-life applications, e.g., self-driving and human-robot interaction. The recent approaches adopt the unrestricted full-connection graph representation to capture the relationships inside the human skeleton. However, there are two issues to be solved: (i) these unrestricted full-connection graph representation methods neglect the inherent dependencies across the joints of the human body; (ii) these methods represent human motions using the features extracted from a single level and thus can neither fully exploit the various connection relationships among the human body nor guarantee the human motion prediction results to be reasonable. To tackle the above issues, we propose an adaptive multi-level hypergraph convolution network (AMHGCN), which uses the adaptive multi-level hypergraph representation to capture various dependencies among the human body. Our method has four different levels of hypergraph representations, including (i) the joint-level hypergraph representation to capture inherent kinetic dependencies in the human body, (ii) the part-level hypergraph representation to exploit the kinetic characteristics at a higher level (in comparison to the joint-level) by viewing some part of the human body as an entirety, (iii) the component-level hypergraph representation to model the semantic information, and (iv) the global-level hypergraph representation to extract long-distance dependencies in the human body. In addition, to take full advantage of the knowledge carried in the training data, we propose a reverse loss (i.e., adopting the future human poses to predict the historical poses reversely) to realize data augmentation. Extensive experiments show that our proposed AMHGCN can achieve state-of-the-art performance on three benchmarks, i.e., Human3.6M, CMU-Mocap, and 3DPW.

Boldini D, Ballabio D, Consonni V, Todeschini R, Grisoni F, Sieber SA. Effectiveness of molecular fingerprints for exploring the chemical space of natural products. J Cheminform 2024;16:35. [PMID: 38528548 DOI: 10.1186/s13321-024-00830-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Accepted: 03/17/2024] [Indexed: 03/27/2024] Open

Abstract

Natural products are a diverse class of compounds with promising biological properties, such as high potency and excellent selectivity. However, they have different structural motifs than typical drug-like compounds, e.g., a wider range of molecular weight, multiple stereocenters and higher fraction of sp3-hybridized carbons. This makes the encoding of natural products via molecular fingerprints difficult, thus restricting their use in cheminformatics studies. To tackle this issue, we explored over 30 years of research to systematically evaluate which molecular fingerprint provides the best performance on the natural product chemical space. We considered 20 molecular fingerprints from four different sources, which we then benchmarked on over 100,000 unique natural products from the COCONUT (COlleCtion of Open Natural prodUcTs) and CMNPD (Comprehensive Marine Natural Products Database) databases. Our analysis focused on the correlation between different fingerprints and their classification performance on 12 bioactivity prediction datasets. Our results show that different encodings can provide fundamentally different views of the natural product chemical space, leading to substantial differences in pairwise similarity and performance. While Extended Connectivity Fingerprints are the de facto option for encoding drug-like compounds, other fingerprints matched or outperformed them for bioactivity prediction of natural products. These results highlight the need to evaluate multiple fingerprinting algorithms for optimal performance and suggest new areas of research. Finally, we provide an open-source Python package for computing all molecular fingerprints considered in the study, as well as data and scripts necessary to reproduce the results, at https://github.com/dahvida/NP_Fingerprints.

Lamens A, Bajorath J. Generation of Molecular Counterfactuals for Explainable Machine Learning Based on Core-Substituent Recombination. ChemMedChem 2024;19:e202300586. [PMID: 37983655 DOI: 10.1002/cmdc.202300586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Revised: 11/20/2023] [Accepted: 11/20/2023] [Indexed: 11/22/2023]

Colliandre L, Muller C. Bayesian Optimization in Drug Discovery. Methods Mol Biol 2024;2716:101-136. [PMID: 37702937 DOI: 10.1007/978-1-0716-3449-3_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/14/2023]

Galati S, Di Stefano M, Bertini S, Granchi C, Giordano A, Gado F, Macchia M, Tuccinardi T, Poli G. Identification of New GSK3β Inhibitors through a Consensus Machine Learning-Based Virtual Screening. Int J Mol Sci 2023;24:17233. [PMID: 38139062 PMCID: PMC10743990 DOI: 10.3390/ijms242417233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Revised: 12/05/2023] [Accepted: 12/06/2023] [Indexed: 12/24/2023] Open

Janela T, Bajorath J. Anatomy of Potency Predictions Focusing on Structural Analogues with Increasing Potency Differences Including Activity Cliffs. J Chem Inf Model 2023;63:7032-7044. [PMID: 37943257 DOI: 10.1021/acs.jcim.3c01530] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2023]

Mastropietro A, Feldmann C, Bajorath J. Calculation of exact Shapley values for explaining support vector machine models using the radial basis function kernel. Sci Rep 2023;13:19561. [PMID: 37949930 PMCID: PMC10638308 DOI: 10.1038/s41598-023-46930-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 06/01/2023] [Accepted: 11/07/2023] [Indexed: 11/12/2023] Open

Janela T, Bajorath J. Rationalizing general limitations in assessing and comparing methods for compound potency prediction. Sci Rep 2023;13:17816. [PMID: 37857835 PMCID: PMC10587074 DOI: 10.1038/s41598-023-45086-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Accepted: 10/16/2023] [Indexed: 10/21/2023] Open

Siemers FM, Bajorath J. Differences in learning characteristics between support vector machine and random forest models for compound classification revealed by Shapley value analysis. Sci Rep 2023;13:5983. [PMID: 37045972 PMCID: PMC10097675 DOI: 10.1038/s41598-023-33215-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Accepted: 04/09/2023] [Indexed: 04/14/2023] Open

Janela T, Bajorath J. Large-Scale Predictions of Compound Potency with Original and Modified Activity Classes Reveal General Prediction Characteristics and Intrinsic Limitations of Conventional Benchmarking Calculations. Pharmaceuticals (Basel) 2023;16:ph16040530. [PMID: 37111287 PMCID: PMC10143224 DOI: 10.3390/ph16040530] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/17/2023] [Revised: 03/27/2023] [Accepted: 03/31/2023] [Indexed: 04/05/2023] Open

Josephs N, Lin L, Rosenberg S, Kolaczyk ED. Bayesian classification, anomaly detection, and survival analysis using network inputs with application to the microbiome. Ann Appl Stat 2023. [DOI: 10.1214/22-aoas1623] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023]

Predicting Potent Compounds Using a Conditional Variational Autoencoder Based upon a New Structure-Potency Fingerprint. Biomolecules 2023;13:biom13020393. [PMID: 36830761 PMCID: PMC9953226 DOI: 10.3390/biom13020393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2022] [Revised: 02/07/2023] [Accepted: 02/16/2023] [Indexed: 02/22/2023] Open

Lungu CN, Mangalagiu V, Mangalagiu II, Mehedinti MC. Benzoquinoline Chemical Space: A Helpful Approach in Antibacterial and Anticancer Drug Design. Molecules 2023;28:molecules28031069. [PMID: 36770739 PMCID: PMC9921191 DOI: 10.3390/molecules28031069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2022] [Revised: 01/09/2023] [Accepted: 01/16/2023] [Indexed: 01/24/2023] Open

Tamura S, Miyao T, Bajorath J. Large-scale prediction of activity cliffs using machine and deep learning methods of increasing complexity. J Cheminform 2023;15:4. [PMID: 36611204 PMCID: PMC9825040 DOI: 10.1186/s13321-022-00676-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Accepted: 12/23/2022] [Indexed: 01/09/2023] Open

Abstract

Activity cliffs (AC) are formed by pairs of structural analogues that are active against the same target but have a large difference in potency. While much of our knowledge about ACs has originated from the analysis and comparison of compounds and activity data, several studies have reported AC predictions over the past decade. Different from typical compound classification tasks, AC predictions must be carried out at the level of compound pairs representing ACs or nonACs. Most AC predictions reported so far have focused on individual methods or comparisons of two or three approaches and only investigated a few compound activity classes (from 2 to 10). Although promising prediction accuracy has been reported in most cases, different system set-ups, AC definitions, methods, and calculation conditions were used, precluding direct comparisons of these studies. Therefore, we have carried out a large-scale AC prediction campaign across 100 activity classes comparing machine learning methods of greatly varying complexity, ranging from pair-based nearest neighbor classifiers and decision tree or kernel methods to deep neural networks. The results of our systematic predictions revealed the level of accuracy that can be expected for AC predictions across many different compound classes. In addition, prediction accuracy did not scale with methodological complexity but was significantly influenced by memorization of compounds shared by different ACs or nonACs. In many instances, limited training data were sufficient for building accurate models using different methods and there was no detectable advantage of deep learning over simpler approaches for AC prediction. On a global scale, support vector machine models performed best, by only small margins compared to others including simple nearest neighbor classifiers.

Detecting the modality of a medical image using visual and textual features. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]

Sundin I, Voronov A, Xiao H, Papadopoulos K, Bjerrum EJ, Heinonen M, Patronov A, Kaski S, Engkvist O. Human-in-the-loop assisted de novo molecular design. J Cheminform 2022;14:86. [PMID: 36578043 PMCID: PMC9795720 DOI: 10.1186/s13321-022-00667-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/30/2022] [Accepted: 12/03/2022] [Indexed: 12/29/2022] Open

Joint structural annotation of small molecules using liquid chromatography retention order and tandem mass spectrometry data. Nat Mach Intell 2022. [DOI: 10.1038/s42256-022-00577-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]

Simple nearest-neighbour analysis meets the accuracy of compound potency predictions using complex machine learning models. Nat Mach Intell 2022. [DOI: 10.1038/s42256-022-00581-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/15/2022]

Griffiths RR, Greenfield JL, Thawani AR, Jamasb AR, Moss HB, Bourached A, Jones P, McCorkindale W, Aldrick AA, Fuchter MJ, Lee AA. Data-driven discovery of molecular photoswitches with multioutput Gaussian processes. Chem Sci 2022;13:13541-13551. [PMID: 36507171 PMCID: PMC9682911 DOI: 10.1039/d2sc04306h] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/12/2022] [Accepted: 09/16/2022] [Indexed: 11/11/2022] Open

Machine Learning-Based Virtual Screening for the Identification of Cdk5 Inhibitors. Int J Mol Sci 2022;23:ijms231810653. [PMID: 36142566 PMCID: PMC9502400 DOI: 10.3390/ijms231810653] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 08/26/2022] [Revised: 09/07/2022] [Accepted: 09/09/2022] [Indexed: 12/04/2022] Open

Baranwal M, Magner A, Saldinger J, Turali-Emre ES, Elvati P, Kozarekar S, VanEpps JS, Kotov NA, Violi A, Hero AO. Struct2Graph: a graph attention network for structure based predictions of protein-protein interactions. BMC Bioinformatics 2022;23:370. [PMID: 36088285 PMCID: PMC9464414 DOI: 10.1186/s12859-022-04910-9] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 06/09/2022] [Accepted: 08/26/2022] [Indexed: 12/03/2022] Open

Abstract

BACKGROUND

Development of new methods for analysis of protein-protein interactions (PPIs) at molecular and nanometer scales gives insights into intracellular signaling pathways and will improve understanding of protein functions, as well as other nanoscale structures of biological and abiological origins. Recent advances in computational tools, particularly the ones involving modern deep learning algorithms, have been shown to complement experimental approaches for describing and rationalizing PPIs. However, most of the existing works on PPI predictions use protein-sequence information, and thus have difficulties in accounting for the three-dimensional organization of the protein chains.

RESULTS

In this study, we address this problem and describe a PPI analysis based on a graph attention network, named Struct2Graph, for identifying PPIs directly from the structural data of folded protein globules. Our method is capable of predicting the PPI with an accuracy of 98.89% on the balanced set consisting of an equal number of positive and negative pairs. On the unbalanced set with the ratio of 1:10 between positive and negative pairs, Struct2Graph achieves a fivefold cross validation average accuracy of 99.42%. Moreover, Struct2Graph can potentially identify residues that likely contribute to the formation of the protein-protein complex. The identification of important residues is tested for two different interaction types: (a) Proteins with multiple ligands competing for the same binding area, (b) Dynamic protein-protein adhesion interaction. Struct2Graph identifies interacting residues with 30% sensitivity, 89% specificity, and 87% accuracy.

CONCLUSIONS

In this manuscript, we address the problem of prediction of PPIs using a first of its kind, 3D-structure-based graph attention network (code available at https://github.com/baranwa2/Struct2Graph). Furthermore, the novel mutual attention mechanism provides insights into likely interaction sites through its unsupervised knowledge selection process. This study demonstrates that a relatively low-dimensional feature embedding learned from graph structures of individual proteins outperforms other modern machine learning classifiers based on global protein features. In addition, through the analysis of single amino acid variations, the attention mechanism shows preference for disease-causing residue variations over benign polymorphisms, demonstrating that it is not limited to interface residues.

Wang Z, Cao Q, Shen H, Xu B, Cen K, Cheng X. Location-aware convolutional neural networks for graph classification. Neural Netw 2022;155:74-83. [PMID: 36041282 DOI: 10.1016/j.neunet.2022.07.035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2022] [Revised: 06/06/2022] [Accepted: 07/30/2022] [Indexed: 11/25/2022]

García-Ortegón M, Simm GNC, Tripp AJ, Hernández-Lobato JM, Bender A, Bacallado S. DOCKSTRING: Easy Molecular Docking Yields Better Benchmarks for Ligand Design. J Chem Inf Model 2022;62:3486-3502. [PMID: 35849793 PMCID: PMC9364321 DOI: 10.1021/acs.jcim.1c01334] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Received: 11/03/2021] [Indexed: 01/05/2023]

Asahara R, Miyao T. Extended Connectivity Fingerprints as a Chemical Reaction Representation for Enantioselective Organophosphorus-Catalyzed Asymmetric Reaction Prediction. ACS Omega 2022;7:26952-26964. [PMID: 35936487 PMCID: PMC9352214 DOI: 10.1021/acsomega.2c03812] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 06/18/2022] [Accepted: 07/07/2022] [Indexed: 06/15/2023]

Feldmann C, Bajorath J. Calculation of Exact Shapley Values for Support Vector Machines with Tanimoto Kernel Enables Model Interpretation. iScience 2022;25:105023. [PMID: 36105596 PMCID: PMC9464958 DOI: 10.1016/j.isci.2022.105023] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/03/2022] [Revised: 08/09/2022] [Accepted: 08/20/2022] [Indexed: 11/24/2022] Open

Yang P, Henle EA, Fern XZ, Simon CM. Classifying the toxicity of pesticides to honey bees via support vector machines with random walk graph kernels. J Chem Phys 2022;157:034102. [DOI: 10.1063/5.0090573] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/22/2023] Open

Multi-task convolutional neural networks for predicting in vitro clearance endpoints from molecular images. J Comput Aided Mol Des 2022;36:443-457. [PMID: 35618861 DOI: 10.1007/s10822-022-00458-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2022] [Accepted: 05/04/2022] [Indexed: 10/18/2022]

Janela T, Takeuchi K, Bajorath J. Introducing a Chemically Intuitive Core-Substituent Fingerprint Designed to Explore Structural Requirements for Effective Similarity Searching and Machine Learning. Molecules 2022;27:molecules27072331. [PMID: 35408730 PMCID: PMC9000322 DOI: 10.3390/molecules27072331] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/14/2022] [Revised: 03/29/2022] [Accepted: 04/01/2022] [Indexed: 11/16/2022]

Abstract

Fingerprint (FP) representations of chemical structure continue to be one of the most widely used types of molecular descriptors in chemoinformatics and computational medicinal chemistry. One often distinguishes between two- and three-dimensional (2D and 3D) FPs depending on whether they are derived from molecular graphs or conformations, respectively. Primary application areas for FPs include similarity searching and compound classification via machine learning, especially for hit identification. For these applications, 2D FPs are particularly popular, given their robustness and for the most part comparable (or better) performance to 3D FPs. While a variety of FP prototypes has been designed and evaluated during earlier times of chemoinformatics research, new developments have been rare over the past decade. At least in part, this has been due to the situation that topological (atom environment) FPs derived from molecular graphs have evolved as a gold standard in the field. We were interested in exploring the question of whether the amount of structural information captured by state-of-the-art 2D FPs is indeed required for effective similarity searching and compound classification or whether accounting for fewer structural features might be sufficient. Therefore, pursuing a "structural minimalist" approach, we designed and implemented a new 2D FP based upon ring and substituent fragments obtained by systematically decomposing large numbers of compounds from medicinal chemistry. The resulting FP termed core-substituent FP (CSFP) captures much smaller numbers of structural features than state-of-the-art 2D FPs. However, CSFP achieves high performance in similarity searching and machine learning, demonstrating that less structural information is required for establishing molecular similarity relationships than is often believed. Given its high performance and chemical tangibility, CSFP is also relevant for practical applications in medicinal chemistry.

Ligand-based approaches to activity prediction for the early stage of structure–activity–relationship progression. J Comput Aided Mol Des 2022;36:237-252. [DOI: 10.1007/s10822-022-00449-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2021] [Accepted: 03/07/2022] [Indexed: 11/27/2022]

Rodríguez-Pérez R, Bajorath J. Evolution of Support Vector Machine and Regression Modeling in Chemoinformatics and Drug Discovery. J Comput Aided Mol Des 2022;36:355-362. [PMID: 35304657 PMCID: PMC9325859 DOI: 10.1007/s10822-022-00442-9] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Received: 02/08/2022] [Accepted: 02/15/2022] [Indexed: 11/05/2022]

Capecchi A, Reymond JL. Classifying natural products from plants, fungi or bacteria using the COCONUT database and machine learning. J Cheminform 2021;13:82. [PMID: 34663470 PMCID: PMC8524952 DOI: 10.1186/s13321-021-00559-3] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 06/25/2021] [Accepted: 10/02/2021] [Indexed: 01/13/2023] Open

Tamura S, Jasial S, Miyao T, Funatsu K. Interpretation of Ligand-Based Activity Cliff Prediction Models Using the Matched Molecular Pair Kernel. Molecules 2021;26:molecules26164916. [PMID: 34443503 PMCID: PMC8401777 DOI: 10.3390/molecules26164916] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/16/2021] [Revised: 08/09/2021] [Accepted: 08/10/2021] [Indexed: 11/16/2022] Open

Bach E, Rogers S, Williamson J, Rousu J. Probabilistic framework for integration of mass spectrum and retention time information in small molecule identification. Bioinformatics 2021;37:1724-1731. [PMID: 33244585 PMCID: PMC8289373 DOI: 10.1093/bioinformatics/btaa998] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/18/2020] [Revised: 10/27/2020] [Accepted: 11/17/2020] [Indexed: 11/14/2022] Open

Safizadeh H, Simpkins SW, Nelson J, Li SC, Piotrowski JS, Yoshimura M, Yashiroda Y, Hirano H, Osada H, Yoshida M, Boone C, Myers CL. Improving Measures of Chemical Structural Similarity Using Machine Learning on Chemical-Genetic Interactions. J Chem Inf Model 2021;61:4156-4172. [PMID: 34318674 PMCID: PMC8479812 DOI: 10.1021/acs.jcim.0c00993] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/23/2022]

Affiliation(s)

Hamid Safizadeh: Department of Electrical and Computer Engineering, University of Minnesota-Twin Cities, Minneapolis, Minnesota 55455, United States; Department of Computer Science and Engineering, University of Minnesota-Twin Cities, Minneapolis, Minnesota 55455, United States
Scott W Simpkins: Bioinformatics and Computational Biology Graduate Program, University of Minnesota-Twin Cities, Minneapolis, Minnesota 55455, United States
Justin Nelson: Bioinformatics and Computational Biology Graduate Program, University of Minnesota-Twin Cities, Minneapolis, Minnesota 55455, United States
Sheena C Li: The Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada; RIKEN Center for Sustainable Resource Science (CSRS), Wako, Saitama 351-0198, Japan
Jeff S Piotrowski: RIKEN Center for Sustainable Resource Science (CSRS), Wako, Saitama 351-0198, Japan
Mami Yoshimura: RIKEN Center for Sustainable Resource Science (CSRS), Wako, Saitama 351-0198, Japan
Yoko Yashiroda: RIKEN Center for Sustainable Resource Science (CSRS), Wako, Saitama 351-0198, Japan
Hiroyuki Hirano: RIKEN Center for Sustainable Resource Science (CSRS), Wako, Saitama 351-0198, Japan
Hiroyuki Osada: RIKEN Center for Sustainable Resource Science (CSRS), Wako, Saitama 351-0198, Japan
Minoru Yoshida: RIKEN Center for Sustainable Resource Science (CSRS), Wako, Saitama 351-0198, Japan; Department of Biotechnology and Collaborative Research Institute for Innovative Microbiology, The University of Tokyo, Bunkyo City, Tokyo 113-8654, Japan
Charles Boone: The Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 3E1, Canada; RIKEN Center for Sustainable Resource Science (CSRS), Wako, Saitama 351-0198, Japan
Chad L Myers: Department of Computer Science and Engineering, University of Minnesota-Twin Cities, Minneapolis, Minnesota 55455, United States; Bioinformatics and Computational Biology Graduate Program, University of Minnesota-Twin Cities, Minneapolis, Minnesota 55455, United States

Dash T, Srinivasan A, Vig L. Incorporating symbolic domain knowledge into graph neural networks. Mach Learn 2021. [DOI: 10.1007/s10994-021-05966-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]

Casier B, Chagas da Silva M, Badawi M, Pascale F, Bučko T, Lebègue S, Rocca D. Hybrid localized graph kernel for machine learning energy-related properties of molecules and solids. J Comput Chem 2021;42:1390-1401. [PMID: 34009668 DOI: 10.1002/jcc.26550] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2021] [Revised: 04/07/2021] [Accepted: 04/21/2021] [Indexed: 11/10/2022]

Errica F, Giulini M, Bacciu D, Menichetti R, Micheli A, Potestio R. A Deep Graph Network-Enhanced Sampling Approach to Efficiently Explore the Space of Reduced Representations of Proteins. Front Mol Biosci 2021;8:637396. [PMID: 33996896 PMCID: PMC8116519 DOI: 10.3389/fmolb.2021.637396] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2020] [Accepted: 02/17/2021] [Indexed: 12/12/2022]

Kunkel C, Margraf JT, Chen K, Oberhofer H, Reuter K. Active discovery of organic semiconductors. Nat Commun 2021;12:2422. [PMID: 33893287 PMCID: PMC8065160 DOI: 10.1038/s41467-021-22611-4] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/29/2020] [Accepted: 03/15/2021] [Indexed: 01/16/2023]

Jia L, Gaüzère B, Honeine P. graphkit-learn: A Python library for graph kernels based on linear patterns. Pattern Recognit Lett 2021. [DOI: 10.1016/j.patrec.2021.01.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]

Galati S, Yonchev D, Rodríguez-Pérez R, Vogt M, Tuccinardi T, Bajorath J. Predicting Isoform-Selective Carbonic Anhydrase Inhibitors via Machine Learning and Rationalizing Structural Features Important for Selectivity. ACS Omega 2021;6:4080-4089. [PMID: 33585783 PMCID: PMC7876851 DOI: 10.1021/acsomega.0c06153] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/17/2020] [Accepted: 01/14/2021] [Indexed: 05/03/2023]

Shibayama S, Funatsu K. Industrial Case Study: Identification of Important Substructures and Exploration of Monomers for the Rapid Design of Novel Network Polymers with Distributed Representation. Bull Chem Soc Jpn 2021. [DOI: 10.1246/bcsj.20200220] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]

Blaschke T, Feldmann C, Bajorath J. Prediction of Promiscuity Cliffs Using Machine Learning. Mol Inform 2021;40:e2000196. [PMID: 32881355 PMCID: PMC7816223 DOI: 10.1002/minf.202000196] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/06/2020] [Accepted: 09/03/2020] [Indexed: 12/22/2022]

Yonchev D, Bajorath J. DeepCOMO: from structure-activity relationship diagnostics to generative molecular design using the compound optimization monitor methodology. J Comput Aided Mol Des 2020;34:1207-1218. [PMID: 33015739 PMCID: PMC7595974 DOI: 10.1007/s10822-020-00349-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/31/2020] [Accepted: 09/29/2020] [Indexed: 11/26/2022]