1
Ma M, Huang M, He Y, Fang J, Li J, Li X, Liu M, Zhou M, Cui G, Fan Q. Network Medicine: A Potential Approach for Virtual Drug Screening. Pharmaceuticals (Basel) 2024; 17:899. [PMID: 39065749 PMCID: PMC11280361 DOI: 10.3390/ph17070899] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2024] [Revised: 06/27/2024] [Accepted: 07/04/2024] [Indexed: 07/28/2024] Open
Abstract
Traditional drug screening methods typically focus on a single protein target and exhibit limited efficiency due to the multifactorial nature of most diseases, which result from disturbances within complex networks of protein-protein interactions rather than single gene abnormalities. Addressing this limitation requires a comprehensive drug screening strategy. Network medicine is rooted in systems biology and provides a comprehensive framework for understanding disease mechanisms, prevention, and therapeutic innovations. This approach not only explores the associations between various diseases but also quantifies the relationships between disease genes and drug targets within interactome networks, thus facilitating the prediction of drug-disease relationships and enabling the screening of therapeutic drugs for specific complex diseases. An increasing body of research supports the efficiency and utility of network-based strategies in drug screening. This review highlights the transformative potential of network medicine in virtual therapeutic screening for complex diseases, offering novel insights and a robust foundation for future drug discovery endeavors.
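The abstract does not say which measure is used to quantify the relationship between disease genes and drug targets. As a minimal sketch, assuming a "closest distance" proximity (the mean, over drug targets, of the shortest-path distance to the nearest disease gene) on an unweighted interactome, the idea might look like the following. The function names and the adjacency-list representation are hypothetical and chosen only for illustration.

```python
from collections import deque


def shortest_path_lengths(graph, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closest_proximity(graph, drug_targets, disease_genes):
    """Mean distance from each drug target to its nearest disease gene.

    Assumption: targets that cannot reach any disease gene are skipped;
    if no target reaches any disease gene, infinity is returned.
    """
    total = 0
    counted = 0
    for target in drug_targets:
        dist = shortest_path_lengths(graph, target)
        reachable = [dist[gene] for gene in disease_genes if gene in dist]
        if reachable:
            total += min(reachable)
            counted += 1
    if counted == 0:
        return float("inf")
    return total / counted
```

A smaller value suggests that the drug's targets sit closer to the disease module. In practice such a raw distance is usually compared against randomized node sets before any conclusion is drawn, a step this sketch leaves out.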
Affiliation(s)
- Mingxuan Ma
  - School of Bioengineering, Zhuhai Campus of Zunyi Medical University, Zhuhai 519000, China; (M.M.); (M.H.); (Y.H.); (J.L.); (M.L.); (M.Z.)
- Mei Huang
  - School of Bioengineering, Zhuhai Campus of Zunyi Medical University, Zhuhai 519000, China; (M.M.); (M.H.); (Y.H.); (J.L.); (M.L.); (M.Z.)
- Yinting He
  - School of Bioengineering, Zhuhai Campus of Zunyi Medical University, Zhuhai 519000, China; (M.M.); (M.H.); (Y.H.); (J.L.); (M.L.); (M.Z.)
- Jiansong Fang
  - Science and Technology Innovation Center, Guangzhou University of Chinese Medicine, Guangzhou 570000, China
- Jiachao Li
  - School of Bioengineering, Zhuhai Campus of Zunyi Medical University, Zhuhai 519000, China; (M.M.); (M.H.); (Y.H.); (J.L.); (M.L.); (M.Z.)
- Xiaohan Li
  - School of Bioengineering, Zhuhai Campus of Zunyi Medical University, Zhuhai 519000, China; (M.M.); (M.H.); (Y.H.); (J.L.); (M.L.); (M.Z.)
- Mengchen Liu
  - School of Bioengineering, Zhuhai Campus of Zunyi Medical University, Zhuhai 519000, China; (M.M.); (M.H.); (Y.H.); (J.L.); (M.L.); (M.Z.)
- Mei Zhou
  - School of Bioengineering, Zhuhai Campus of Zunyi Medical University, Zhuhai 519000, China; (M.M.); (M.H.); (Y.H.); (J.L.); (M.L.); (M.Z.)
- Guozhen Cui
  - School of Bioengineering, Zhuhai Campus of Zunyi Medical University, Zhuhai 519000, China; (M.M.); (M.H.); (Y.H.); (J.L.); (M.L.); (M.Z.)
- Qing Fan
  - Basic Medical Science Department, Zhuhai Campus of Zunyi Medical University, Zhuhai 519041, China
2
Weir H, Thompson K, Woodward A, Choi B, Braun A, Martínez TJ. ChemPix: automated recognition of hand-drawn hydrocarbon structures using deep learning. Chem Sci 2021; 12:10622-10633. [PMID: 34447555 PMCID: PMC8365825 DOI: 10.1039/d1sc02957f] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 06/01/2021] [Accepted: 06/28/2021] [Indexed: 11/21/2022] Open
Abstract
Inputting molecules into chemistry software, such as quantum chemistry packages, currently requires domain expertise, expensive software and/or cumbersome procedures. Leveraging recent breakthroughs in machine learning, we develop ChemPix: an offline, hand-drawn hydrocarbon structure recognition tool designed to remove these barriers. A neural image captioning approach consisting of a convolutional neural network (CNN) encoder and a long short-term memory (LSTM) decoder learned a mapping from photographs of hand-drawn hydrocarbon structures to machine-readable SMILES representations. We generated a large auxiliary training dataset, based on RDKit molecular images, by combining image augmentation, image degradation and background addition. Additionally, a small dataset of ∼600 hand-drawn hydrocarbon chemical structures was crowd-sourced using a phone web application. These datasets were used to train the image-to-SMILES neural network with the goal of maximizing the hand-drawn hydrocarbon recognition accuracy. By forming a committee of the trained neural networks where each network casts one vote for the predicted molecule, we achieved a nearly 10 percentage point improvement of the molecule recognition accuracy and were able to assign a confidence value for the prediction based on the number of agreeing votes. The ensemble model achieved an accuracy of 76% on hand-drawn hydrocarbons, increasing to 86% if the top 3 predictions were considered.
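The committee scheme described in the abstract, where each trained network casts one vote and the number of agreeing votes gives a confidence value, can be illustrated with a short sketch. This is not the ChemPix implementation. The function name is hypothetical, and the sketch assumes each network's output has already been reduced to a single comparable SMILES string (for example a canonical form), so that identical molecules compare equal as strings.

```python
from collections import Counter


def committee_vote(predictions):
    """Combine one predicted SMILES per network into a committee decision.

    Returns (winning SMILES, fraction of networks that agree with it,
    all distinct candidates ranked by vote count).
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    ranked = Counter(predictions).most_common()  # ties keep first-seen order
    winner, votes = ranked[0]
    confidence = votes / len(predictions)
    return winner, confidence, [smiles for smiles, _ in ranked]


# Hypothetical outputs of a five-network committee for one drawing.
votes = ["CCC", "CCC", "C=CC", "CCC", "CCCC"]
best, confidence, candidates = committee_vote(votes)
top3 = candidates[:3]  # the "top 3 predictions" view mentioned in the abstract
```

Using the agreeing fraction as the confidence value keeps the score comparable across committees of different sizes.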
Affiliation(s)
- Hayley Weir
  - Department of Chemistry, Stanford University, Stanford, CA 94305, USA
  - SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, CA 94025, USA
- Keiran Thompson
  - Department of Chemistry, Stanford University, Stanford, CA 94305, USA
  - SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, CA 94025, USA
- Amelia Woodward
  - Department of Chemistry, Stanford University, Stanford, CA 94305, USA
- Benjamin Choi
  - Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA
- Augustin Braun
  - Department of Chemistry, Stanford University, Stanford, CA 94305, USA
- Todd J Martínez
  - Department of Chemistry, Stanford University, Stanford, CA 94305, USA
  - SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, CA 94025, USA
3
Abstract
Within the last decade, open data concepts have been gaining increasing interest in the area of drug discovery. With the launch of ChEMBL and PubChem, an enormous amount of bioactivity data was made easily accessible to the public domain. In addition, platforms that semantically integrate those data, such as the Open PHACTS Discovery Platform, permit querying across different domains of open life science data beyond the concept of ligand-target-pharmacology. However, most public databases are compiled from literature sources and are thus heterogeneous in their coverage. In addition, assay descriptions are not uniform and most often lack relevant information in the primary literature and, consequently, in databases. This raises the question of how useful large public data sources are for deriving computational models. In this perspective, we highlight selected open-source initiatives and outline the possibilities and also the limitations when exploiting this huge amount of bioactivity data.
4
Minkiewicz P, Miciński J, Darewicz M, Bucholska J. Biological and Chemical Databases for Research into the Composition of Animal Source Foods. Food Reviews International 2013. [DOI: 10.1080/87559129.2013.818011] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 10/26/2022]
5
Wang Y, Lonard DM, Yu Y, Chow DC, Palzkill TG, O'Malley BW. Small molecule inhibition of the steroid receptor coactivators, SRC-3 and SRC-1. Mol Endocrinol 2011; 25:2041-53. [PMID: 22053001 DOI: 10.1210/me.2011-1222] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.9] [Indexed: 12/13/2022] Open
Abstract
Overexpression of steroid receptor coactivator (SRC)-1 and SRC-3 is associated with cancer initiation, metastasis, advanced disease, and resistance to chemotherapy. In most of these cases, SRC-1 and SRC-3 have been shown to promote tumor cell growth by activating nuclear receptor and multiple growth factor signaling cascades that lead to uncontrolled tumor cell growth. Up until now, most targeted chemotherapeutic drugs have been designed largely to block a single pathway at a time, but cancers frequently acquire resistance by switching to alternative growth factor pathways. We reason that chemotherapeutic agents developed against SRC coactivators, which sit at the nexus of multiple cell growth signaling networks and transcriptional factors, should be particularly effective therapeutics. To substantiate this hypothesis, we report the discovery of 2,2'-bis-(Formyl-1,6,7-trihydroxy-5-isopropyl-3-methylnaphthalene) (gossypol) as a small molecule inhibitor of coactivator SRC-1 and SRC-3. Our data indicate that gossypol binds directly to SRC-3 in its receptor interacting domain. In MCF-7 breast cancer cells, gossypol selectively reduces the cellular protein concentrations of SRC-1 and SRC-3 without generally altering overall protein expression patterns, SRC-2, or other coactivators, such as p300 and coactivator-associated arginine methyltransferase 1. Gossypol reduces the concentration of SRC-3 in prostate, lung, and liver cancer cell lines. Gossypol inhibits cell viability in the same cancer cell lines where it promotes SRC-3 down-regulation. Additionally, gossypol sensitizes lung and breast cancer cell lines to the inhibitory effects of other chemotherapeutic agents. Importantly, gossypol is selectively cytotoxic to cancer cells, whereas normal cell viability is not affected. These data establish the proof-of-principle that, as a class, SRC-1 and SRC-3 coactivators are accessible chemotherapeutic targets. Given their function as integrators of multiple cell growth signaling systems, SRC-1/SRC-3 small molecule inhibitors comprise a new class of drugs that have potential as novel chemotherapeutics able to defeat aspects of acquired cancer cell resistance mechanisms.
Affiliation(s)
- Ying Wang
- Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, Texas 77030, USA
6
Zheng N, Tsai HN, Zhang X, Rosania GR. The subcellular distribution of small molecules: from pharmacokinetics to synthetic biology. Mol Pharm 2011; 8:1619-28. [PMID: 21805990 DOI: 10.1021/mp200092v] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.1] [Indexed: 12/21/2022]
Abstract
The systemic pharmacokinetics and pharmacodynamics of small molecules are determined by subcellular transport phenomena. Although approaches used to study the subcellular distribution of small molecules have gradually evolved over the past several decades, experimental analysis and prediction of cellular pharmacokinetics remain a challenge. In this review, we survey the progress of subcellular distribution research since the 1960s, with a focus on the advantages, disadvantages and limitations of the various experimental techniques. Critical review of the existing body of knowledge points to many opportunities to advance the rational design of organelle-targeted chemical agents. These opportunities include (1) development of quantitative, non-fluorescence-based, whole cell methods and techniques to measure the subcellular distribution of chemical agents in multiple compartments; (2) exploratory experimentation with nonspecific transport probes that have not been enriched with putative, organelle-targeting features; (3) elaboration of hypothesis-driven, mechanistic and modeling-based approaches to guide experiments aimed at elucidating subcellular distribution and transport; and (4) introduction of revolutionary conceptual approaches borrowed from the field of synthetic biology combined with cutting-edge experimental strategies. In our laboratory, state-of-the-art subcellular transport studies are now being aimed at understanding the formation of new intracellular membrane structures in response to drug therapy, exploring the function of drug-membrane complexes as intracellular drug depots, and synthesizing new organelles with extraordinary physical and chemical properties.
Affiliation(s)
- Nan Zheng
- Department of Pharmaceutical Sciences, College of Pharmacy, University of Michigan, Ann Arbor, Michigan 48109, United States
7
Abstract
IMPORTANCE OF THE FIELD: PubChem is a public molecular information repository, a scientific showcase of the NIH Roadmap Initiative. The PubChem database holds over 27 million records of unique chemical structures of compounds (CID) derived from nearly 70 million substance depositions (SID), and contains more than 449,000 bioassay records covering thousands of in vitro biochemical and cell-based screening bioassays, targeting more than 7000 proteins and genes and linked to over 1.8 million substances. AREAS COVERED IN THIS REVIEW: This review builds on recent PubChem-related computational chemistry research reported by other authors while providing readers with an overview of the PubChem database, focusing on its increasing role in cheminformatics, virtual screening and toxicity prediction modeling. WHAT THE READER WILL GAIN: These publicly available datasets in PubChem provide great opportunities for scientists to perform cheminformatics and virtual screening research for computer-aided drug design. However, the high volume and complexity of the datasets, in particular the bioassay-associated false positives/negatives and highly imbalanced datasets in PubChem, also create major challenges. Several approaches regarding the modeling of PubChem datasets and development of virtual screening models for bioactivity and toxicity predictions are also reviewed. TAKE HOME MESSAGE: Novel data-mining cheminformatics tools and virtual screening algorithms are being developed and used to retrieve, annotate and analyze the large-scale and highly complex PubChem biological screening data for drug design.
Affiliation(s)
- Xiang-Qun Xie
- Department of Pharmaceutical Sciences, School of Pharmacy; Drug Discovery Institute/Pittsburgh Molecular Library Screening Center (PMLSC); Pittsburgh Chemical Methodologies & Library Development (PCMLD) Center; Departments of Computational Biology and Structural Biology; University of Pittsburgh, Pittsburgh, PA 15260, USA
8
Nashev LG, Schuster D, Laggner C, Sodha S, Langer T, Wolber G, Odermatt A. The UV-filter benzophenone-1 inhibits 17β-hydroxysteroid dehydrogenase type 3: Virtual screening as a strategy to identify potential endocrine disrupting chemicals. Biochem Pharmacol 2010; 79:1189-99. [DOI: 10.1016/j.bcp.2009.12.005] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.1] [Received: 09/23/2009] [Revised: 12/03/2009] [Accepted: 12/04/2009] [Indexed: 11/26/2022]
9
Li Q, Wang Y, Bryant SH. A novel method for mining highly imbalanced high-throughput screening data in PubChem. Bioinformatics 2009; 25:3310-6. [PMID: 19825798 PMCID: PMC2788930 DOI: 10.1093/bioinformatics/btp589] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.0] [Indexed: 12/30/2022]
Abstract
Motivation: The comprehensive information on small molecules and their biological activities in PubChem brings great opportunities for academic researchers. However, mining high-throughput screening (HTS) assay data remains a great challenge given the very large data volume and the highly imbalanced nature of the data, with only a small number of active compounds compared to inactive compounds. Therefore, there is currently a need for better strategies to work with HTS assay data. Moreover, as luciferase-based HTS technology is frequently exploited in the assays deposited in PubChem, constructing a computational model to distinguish and filter out potential interference compounds for these assays is another motivation. Results: We used the granular support vector machines (SVMs) repetitive under sampling method (GSVM-RU) to construct an SVM from luciferase inhibition bioassay data in which the imbalance ratio of active/inactive compounds is high (1/377). The best model recognized the active and inactive compounds at accuracies of 86.60% and 88.89%, with a total accuracy of 87.74%, by cross-validation test and blind test. These results demonstrate the robustness of the model in handling the intrinsic imbalance problem in HTS data, and it can be used as a virtual screening tool to identify potential interference compounds in luciferase-based HTS experiments. Additionally, this method has also proved computationally efficient by greatly reducing the computational cost and can be easily adopted in the analysis of HTS data for other biological systems. Availability: Data are publicly available in PubChem with AIDs of 773, 1006 and 1379. Contact: ywang@ncbi.nlm.nih.gov; bryant@ncbi.nlm.nih.gov Supplementary information: Supplementary data are available at Bioinformatics online.
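To make the imbalance problem concrete, the sketch below shows plain random undersampling together with per-class accuracy reporting. It is explicitly not GSVM-RU, which selects informative inactive compounds with the help of SVMs. Here the inactives are simply drawn at random, and all names and parameters are hypothetical.

```python
import random


def undersample_rounds(actives, inactives, n_rounds, ratio=1, seed=0):
    """Yield (actives, sampled inactives) pairs for repeated balanced training.

    Every round keeps all actives and draws ratio * len(actives) inactives
    at random (without replacement within a round).
    """
    rng = random.Random(seed)
    n_draw = min(len(inactives), ratio * len(actives))
    for _ in range(n_rounds):
        yield list(actives), rng.sample(inactives, n_draw)


def per_class_accuracy(y_true, y_pred):
    """Fraction of correct predictions computed separately for each true label."""
    correct, total = {}, {}
    for truth, predicted in zip(y_true, y_pred):
        total[truth] = total.get(truth, 0) + 1
        correct[truth] = correct.get(truth, 0) + (truth == predicted)
    return {label: correct[label] / total[label] for label in total}
```

Reporting accuracy per class matters at a ratio such as 1/377: a model that labels every compound inactive would score above 99% overall while recognizing no active compound at all.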
Affiliation(s)
- Qingliang Li
- National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD 20894, USA
10
Park J, Rosania GR, Saitou K. Tunable machine vision-based strategy for automated annotation of chemical databases. J Chem Inf Model 2009; 49:1993-2001. [PMID: 19621901 PMCID: PMC2907084 DOI: 10.1021/ci900029v] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
We present a tunable, machine vision-based strategy for automated annotation of virtual small molecule databases. The proposed strategy is based on the use of a machine vision-based tool for extracting structure diagrams in research articles and converting them into connection tables, a virtual "Chemical Expert" system for screening the converted structures based on the adjustable levels of estimated conversion accuracy, and a fragment-based measure for calculating intermolecular similarity. For annotation, calculated chemical similarity between the converted structures and entries in a virtual small molecule database is used to establish the links. The overall annotation performance can be tuned by adjusting the cutoff threshold of the estimated conversion accuracy. We perform an annotation test which attempts to link 121 journal articles registered in PubMed to entries in PubChem, which is the largest publicly accessible chemical database. Two test cases are performed, and their results are compared to see how the overall annotation performance is affected by the different threshold levels of the estimated accuracy of the converted structure. Our work demonstrates that over 45% of the articles could have true positive links to entries in the PubChem database with promising recall and precision rates in both tests. Furthermore, we illustrate that the Chemical Expert system, which can screen converted structures based on the adjustable levels of estimated conversion accuracy, is a key factor impacting the overall annotation performance. We propose that this machine vision-based strategy can be incorporated with the text-mining approach to facilitate extraction of contextual scientific knowledge about a chemical structure from the scientific literature.
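The two-stage idea in this abstract (screen converted structures by estimated conversion accuracy, then link the survivors by chemical similarity) can be sketched as follows. The abstract does not name the fragment-based similarity measure, so the Tanimoto coefficient on fragment sets is used here purely as an assumption. The function names, the cutoff value, and the data layout are likewise hypothetical.

```python
def tanimoto(fragments_a, fragments_b):
    """Tanimoto (Jaccard) similarity between two collections of fragments."""
    a, b = set(fragments_a), set(fragments_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def annotate(converted, database, accuracy_cutoff=0.7):
    """Link each sufficiently reliable converted structure to its best match.

    converted: name -> (estimated conversion accuracy, fragment set)
    database:  entry id -> fragment set
    Structures below the accuracy cutoff are screened out before linking.
    """
    links = {}
    for name, (estimated_accuracy, fragments) in converted.items():
        if estimated_accuracy < accuracy_cutoff:
            continue
        best_id = max(database, key=lambda entry_id: tanimoto(fragments, database[entry_id]))
        links[name] = (best_id, tanimoto(fragments, database[best_id]))
    return links
```

Raising `accuracy_cutoff` would be expected to trade recall for precision, which is the tuning behavior the abstract describes.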
Affiliation(s)
- Jungkap Park
  - Department of Mechanical Engineering, University of Michigan, Ann Arbor, Michigan 48109
- Gus R. Rosania
  - Department of Pharmaceutical Sciences, University of Michigan, Ann Arbor, Michigan 48109
- Kazuhiro Saitou
  - Department of Mechanical Engineering, University of Michigan, Ann Arbor, Michigan 48109
11
Park J, Rosania GR, Shedden KA, Nguyen M, Lyu N, Saitou K. Automated extraction of chemical structure information from digital raster images. Chem Cent J 2009; 3:4. [PMID: 19196483 PMCID: PMC2648963 DOI: 10.1186/1752-153x-3-4] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.4] [Received: 10/25/2008] [Accepted: 02/05/2009] [Indexed: 11/16/2022] Open
Abstract
Background: To search for chemical structures in research articles, diagrams or text representing molecules need to be translated to a standard chemical file format compatible with cheminformatic search engines. Nevertheless, chemical information contained in research articles is often referenced as analog diagrams of chemical structures embedded in digital raster images. To automate analog-to-digital conversion of chemical structure diagrams in scientific research articles, several software systems have been developed. But their algorithmic performance and utility in cheminformatic research have not been investigated. Results: This paper aims to provide critical reviews for these systems and also report our recent development of ChemReader, a fully automated tool for extracting chemical structure diagrams in research articles and converting them into standard, searchable chemical file formats. Basic algorithms for recognizing lines and letters representing bonds and atoms in chemical structure diagrams can be independently run in sequence from a graphical user interface (and the algorithm parameters can be readily changed) to facilitate additional development specifically tailored to a chemical database annotation scheme. Compared with existing software programs such as OSRA, Kekule, and CLiDE, our results indicate that ChemReader outperforms other software systems on several sets of sample images from diverse sources in terms of the rate of correct outputs and the accuracy on extracting molecular substructure patterns. Conclusion: The availability of ChemReader as a cheminformatic tool for extracting chemical structure information from digital raster images allows research and development groups to enrich their chemical structure databases by annotating the entries with published research articles. Based on its stable performance and high accuracy, ChemReader may be sufficiently accurate for annotating the chemical database with links to scientific research articles.
Affiliation(s)
- Jungkap Park
- Michigan Alliance for Cheminformatic Exploration, Ann Arbor, MI, USA.
12
Current world literature. Ageing: biology and nutrition. Curr Opin Clin Nutr Metab Care 2009; 12:95-100. [PMID: 19057195 DOI: 10.1097/mco.0b013e32831fd97a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
13
14
Weis DC, Visco DP, Faulon JL. Data mining PubChem using a support vector machine with the Signature molecular descriptor: classification of factor XIa inhibitors. J Mol Graph Model 2008; 27:466-75. [PMID: 18829357 DOI: 10.1016/j.jmgm.2008.08.004] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Received: 07/09/2008] [Revised: 08/19/2008] [Accepted: 08/20/2008] [Indexed: 01/04/2023]
Abstract
The amount of high-throughput screening (HTS) data readily available has significantly increased because of the PubChem project (http://pubchem.ncbi.nlm.nih.gov/). There is considerable opportunity for data mining of small molecules for a variety of biological systems using cheminformatic tools and the resources available through PubChem. In this work, we trained a support vector machine (SVM) classifier using the Signature molecular descriptor on factor XIa inhibitor HTS data. The optimal number of Signatures was selected by implementing a feature selection algorithm of highly correlated clusters. Our method included an improvement that allowed clusters to work together for accuracy improvement, where previous methods have scored clusters on an individual basis. The resulting model had a 10-fold cross-validation accuracy of 89%, and additional validation was provided by two independent test sets. We applied the SVM to rapidly predict activity for approximately 12 million compounds also deposited in PubChem. Confidence in these predictions was assessed by considering the number of Signatures within the training set range for a given compound, defined as the overlap metric. To further evaluate compounds identified as active by the SVM, docking studies were performed using AutoDock. A focused database of compounds predicted to be active was obtained with several of the compounds appreciably dissimilar to those used in training the SVM. This focused database is suitable for further study. The data mining technique presented here is not specific to factor XIa inhibitors, and could be applied to other bioassays in PubChem where one is looking to expand the search for small molecules as chemical probes.
Affiliation(s)
- Derick C Weis
- Department of Chemical Engineering, Tennessee Technological University, Box 5013, Cookeville, TN 38505, USA.