1. López-Pérez K, Kim TD, Miranda-Quintana RA. iSIM: instant similarity. Digital Discovery 2024; 3:1160-1171. [PMID: 38873032 PMCID: PMC11167700 DOI: 10.1039/d4dd00041b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Accepted: 05/06/2024] [Indexed: 06/15/2024]
Abstract
The quantification of molecular similarity has been present since the beginning of cheminformatics. Although several similarity indices and molecular representations have been reported, all of them ultimately reduce to the calculation of molecular similarities of only two objects at a time. Hence, to obtain the average similarity of a set of molecules, all the pairwise comparisons need to be computed, so the computational cost scales quadratically with the number of molecules. Here we propose an exact alternative to this problem: iSIM (instant similarity). iSIM performs comparisons of multiple molecules at the same time and yields the same value as the average pairwise comparisons of molecules represented by binary fingerprints and real-value descriptors. In this work, we introduce the mathematical framework and several applications of iSIM in chemical sampling, visualization, diversity selection, and clustering.
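The contrast between quadratic pairwise averaging and a single pass over the set can be illustrated with a small sketch. It is not the authors' implementation: it assumes binary fingerprints stored as lists of 0/1 and uses the Russell-Rao index (shared on-bits divided by fingerprint length), for which the column-sum shortcut is easy to verify by hand. The function names and the toy fingerprints are hypothetical.

```python
from itertools import combinations
from math import comb


def pairwise_average_rr(fps):
    """Quadratic reference: mean Russell-Rao similarity over all pairs."""
    n_bits = len(fps[0])
    sims = [
        sum(a & b for a, b in zip(x, y)) / n_bits
        for x, y in combinations(fps, 2)
    ]
    return sum(sims) / len(sims)


def column_sum_average_rr(fps):
    """Linear alternative: the same average from per-bit column sums."""
    n, n_bits = len(fps), len(fps[0])
    column_sums = [sum(col) for col in zip(*fps)]
    # A bit that is on in k molecules is shared by comb(k, 2) pairs.
    shared = sum(comb(k, 2) for k in column_sums)
    return shared / (comb(n, 2) * n_bits)


# Toy data (assumed for illustration): three 4-bit fingerprints.
fingerprints = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 1, 1],
]

print(pairwise_average_rr(fingerprints), column_sum_average_rr(fingerprints))
```

Both functions return the same value on this input. The second visits each fingerprint once, so its cost grows linearly with the number of molecules instead of quadratically.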
Affiliation(s)
- Kenneth López-Pérez: Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, Florida 32611, USA
- Taewon D Kim: Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, Florida 32611, USA
2. Chemical space visualization: transforming multidimensional chemical spaces into similarity-based molecular networks. Future Med Chem 2016; 8:1769-78. [PMID: 27572425 DOI: 10.4155/fmc-2016-0023] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 01/01/2023]
Abstract
BACKGROUND: The concept of chemical space is of fundamental relevance for medicinal chemistry and chemical informatics. Multidimensional chemical space representations are coordinate-based. Chemical space networks (CSNs) have been introduced as a coordinate-free representation. RESULTS: A computational approach is presented for the transformation of multidimensional chemical space into CSNs. The design of transformation CSNs (TRANS-CSNs) is based upon a similarity function that directly reflects distance relationships in original multidimensional space. CONCLUSION: TRANS-CSNs provide an immediate visualization of coordinate-based chemical space and do not require the use of dimensionality reduction techniques. At low network density, TRANS-CSNs are readily interpretable and make it possible to evaluate structure-activity relationship information originating from multidimensional chemical space.
3. Lessons learned from the design of chemical space networks and opportunities for new applications. J Comput Aided Mol Des 2016; 30:191-208. [DOI: 10.1007/s10822-016-9906-3] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 01/28/2016] [Accepted: 03/01/2016] [Indexed: 12/13/2022]
4. Design of chemical space networks on the basis of Tversky similarity. J Comput Aided Mol Des 2015; 30:1-12. [DOI: 10.1007/s10822-015-9891-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 12/11/2015] [Accepted: 12/18/2015] [Indexed: 12/28/2022]
5. Zhang B, Vogt M, Maggiora GM, Bajorath J. Design of chemical space networks using a Tanimoto similarity variant based upon maximum common substructures. J Comput Aided Mol Des 2015; 29:937-50. [DOI: 10.1007/s10822-015-9872-1] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 08/21/2015] [Accepted: 09/24/2015] [Indexed: 12/14/2022]
6. Zhang B, Vogt M, Maggiora GM, Bajorath J. Comparison of bioactive chemical space networks generated using substructure- and fingerprint-based measures of molecular similarity. J Comput Aided Mol Des 2015; 29:595-608. [DOI: 10.1007/s10822-015-9852-5] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 03/30/2015] [Accepted: 06/03/2015] [Indexed: 12/20/2022]
7. Sukumar N, Krein MP, Prabhu G, Bhattacharya S, Sen S. Network measures for chemical library design. Drug Dev Res 2015; 75:402-11. [PMID: 25195584 DOI: 10.1002/ddr.21218] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 01/13/2023]
Abstract
In this overview, we examine recent developments in network approaches to drug design. A brief overview of networks is followed by a discussion of how chemical similarity networks and their properties address challenges in drug design. Multiple methods used to assess or enhance chemical diversity for early-stage drug discovery are discussed, as well as methods that can be used for drug repositioning and ligand polypharmacology.
Affiliation(s)
- Nagamani Sukumar: Department of Chemistry, Shiv Nadar University, Dadri, Gautam Budh Nagar, U.P., 201314, India; Center for Informatics, Shiv Nadar University, Dadri, Gautam Budh Nagar, U.P., 201314, India
8. Design and characterization of chemical space networks for different compound data sets. J Comput Aided Mol Des 2014; 29:113-25. [DOI: 10.1007/s10822-014-9821-4] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 10/15/2014] [Accepted: 11/27/2014] [Indexed: 01/23/2023]
9. Kuyoc-Carrillo VF, Medina-Franco JL. Progress in the Analysis of Multiple Activity Profile of Screening Data Using Computational Approaches. Drug Dev Res 2014; 75:313-23. [DOI: 10.1002/ddr.21209] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 01/05/2023]
10. Cheng T, Pan Y, Hao M, Wang Y, Bryant SH. PubChem applications in drug discovery: a bibliometric analysis. Drug Discov Today 2014; 19:1751-1756. [PMID: 25168772 DOI: 10.1016/j.drudis.2014.08.008] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 04/29/2014] [Revised: 07/17/2014] [Accepted: 08/18/2014] [Indexed: 12/18/2022]
Abstract
A bibliometric analysis of PubChem applications is presented by reviewing 1132 research articles. The massive volume of chemical structure and bioactivity data in PubChem and its online services have been used globally in various fields including chemical biology, medicinal chemistry and informatics research. PubChem supports drug discovery in many aspects such as lead identification and optimization, compound-target profiling, polypharmacology studies and unknown chemical identity elucidation. PubChem has also become a valuable resource for developing secondary databases, informatics tools and web services. The growing PubChem resource with its public availability offers support and great opportunities for the interrogation of pharmacological mechanisms and the genetic basis of diseases, which are vital for drug innovation and repurposing.
Affiliation(s)
- Tiejun Cheng: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Yongmei Pan: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Ming Hao: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Yanli Wang: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Stephen H Bryant: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
11.
Affiliation(s)
- Dagmar Stumpfe: Department of Life Science Informatics, B-IT; LIMES Program Unit Chemical Biology and Medicinal Chemistry; Rheinische Friedrich-Wilhelms-Universität Bonn; Bonn D-53113, Germany
- Jürgen Bajorath: Department of Life Science Informatics, B-IT; LIMES Program Unit Chemical Biology and Medicinal Chemistry; Rheinische Friedrich-Wilhelms-Universität Bonn; Bonn D-53113, Germany
12. Chemical space networks: a powerful new paradigm for the description of chemical space. J Comput Aided Mol Des 2014; 28:795-802. [PMID: 24925682 DOI: 10.1007/s10822-014-9760-0] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.0] [Received: 04/14/2014] [Accepted: 06/04/2014] [Indexed: 01/26/2023]
Abstract
The concept of chemical space is playing an increasingly important role in many areas of chemical research, especially medicinal chemistry and chemical biology. It is generally conceived as consisting of numerous compound clusters of varying sizes scattered throughout the space in much the same way as galaxies of stars inhabit our universe. A number of issues associated with this coordinate-based representation are discussed, not the least of which is the continuous nature of the space, a feature not entirely compatible with the inherently discrete nature of chemical space. Cell-based representations, which are derived from coordinate-based spaces, have also been developed that facilitate a number of chemical informatic activities (e.g., diverse subset selection, filling 'diversity voids', and comparing compound collections). These representations generally suffer the 'curse of dimensionality'. In this work, networks are proposed as an attractive paradigm for representing chemical space since they circumvent many of the issues associated with coordinate- and cell-based representations, including the curse of dimensionality. In addition, their relational structure is entirely compatible with the intrinsic nature of chemical space. A description of the features of these chemical space networks is presented that emphasizes their statistical characteristics and indicates how they are related to various types of network topologies that exhibit random, scale-free, and/or 'small world' properties.
13. Csermely P, Korcsmáros T, Kiss HJM, London G, Nussinov R. Structure and dynamics of molecular networks: a novel paradigm of drug discovery: a comprehensive review. Pharmacol Ther 2013; 138:333-408. [PMID: 23384594 PMCID: PMC3647006 DOI: 10.1016/j.pharmthera.2013.01.016] [Citation(s) in RCA: 506] [Impact Index Per Article: 46.0] [Received: 01/22/2013] [Accepted: 01/22/2013] [Indexed: 02/02/2023]
Abstract
Despite considerable progress in genome- and proteome-based high-throughput screening methods and in rational drug design, the increase in approved drugs in the past decade did not match the increase of drug development costs. Network description and analysis not only give a systems-level understanding of drug action and disease complexity, but can also help to improve the efficiency of drug design. We give a comprehensive assessment of the analytical tools of network topology and dynamics. The state-of-the-art use of chemical similarity, protein structure, protein-protein interaction, signaling, genetic interaction and metabolic networks in the discovery of drug targets is summarized. We propose that network targeting follows two basic strategies. The "central hit strategy" selectively targets central nodes/edges of the flexible networks of infectious agents or cancer cells to kill them. The "network influence strategy" works against other diseases, where an efficient reconfiguration of rigid networks needs to be achieved by targeting the neighbors of central nodes/edges. It is shown how network techniques can help in the identification of single-target, edgetic, multi-target and allo-network drug target candidates. We review the recent boom in network methods helping hit identification, lead selection optimizing drug efficacy, as well as minimizing side-effects and drug toxicity. Successful network-based drug development strategies are shown through the examples of infections, cancer, metabolic diseases, neurodegenerative diseases and aging. Summarizing >1200 references we suggest an optimized protocol of network-aided drug development, and provide a list of systems-level hallmarks of drug quality. Finally, we highlight network-related drug development trends helping to achieve these hallmarks by a cohesive, global approach.
Affiliation(s)
- Peter Csermely: Department of Medical Chemistry, Semmelweis University, P.O. Box 260, H-1444 Budapest 8, Hungary
14. Graphs and networks in chemical and biological informatics: past, present and future. Future Med Chem 2012; 4:2039-47. [DOI: 10.4155/fmc.12.128] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/13/2022]
Abstract
Chemical and biological network analysis has recently garnered intense interest from the perspective of drug design and discovery. While graph theoretic concepts have a long history in chemistry – predating quantum mechanics – and graphical measures of chemical structures date back to the 1970s, it is only recently with the advent of public repositories of information and availability of high-throughput assays and computational resources that network analysis of large-scale chemical networks, such as protein–protein interaction networks, has become possible. Drug design and discovery are undergoing a paradigm shift, from the notion of ‘one target, one drug’ to a much more nuanced view that relies on multiple sources of information: genomic, proteomic, metabolomic and so on. This holistic view of drug design is an incredibly daunting undertaking still very much in its infancy. Here, we focus on current developments in graph- and network-centric approaches in chemical and biological informatics, with particular reference to applications in the fields of SAR modeling and drug design. Key insights from the past suggest a path forward via visualization and fusion of multiple sources of chemical network data.