1
Qiu J, Yan X, Tian Y, Li Q, Liu X, Yang Y, Tong HHY, Liu H. PTB-DDI: An Accurate and Simple Framework for Drug-Drug Interaction Prediction Based on Pre-Trained Tokenizer and BiLSTM Model. Int J Mol Sci 2024; 25:11385. [PMID: 39518938 PMCID: PMC11546514 DOI: 10.3390/ijms252111385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2024] [Revised: 10/17/2024] [Accepted: 10/21/2024] [Indexed: 11/16/2024]
Abstract
The simultaneous use of two or more drugs in clinical treatment may raise the risk of a drug-drug interaction (DDI). DDI prediction is very important to avoid adverse drug events in combination therapy. Recently, deep learning methods have been applied successfully to DDI prediction and improved prediction performance. However, there are still some problems with the present models, such as low accuracy due to information loss during molecular representation or incomplete drug feature mining during the training process. Aiming at these problems, this study proposes an accurate and simple framework named PTB-DDI for drug-drug interaction prediction. The PTB-DDI framework consists of four key modules: (1) ChemBerta tokenizer for molecular representation, (2) Bidirectional Long Short-Term Memory (BiLSTM) to capture the bidirectional context-aware features of drugs, (3) Multilayer Perceptron (MLP) for mining the nonlinear relationship of drug features, and (4) interaction predictor to perform an affine transformation and final prediction. In addition, we investigate the effect of dual-mode on parameter-sharing and parameter-independent within the PTB-DDI framework. Furthermore, we conducted comprehensive experiments on the two real-world datasets (i.e., BIOSNAP and DrugBank) to evaluate PTB-DDI framework performance. The results show that our proposed framework has significant improvements over the baselines based on both datasets. Based on the BIOSNAP dataset, the AUC-ROC, PR-AUC, and F1 scores are 0.997, 0.995, and 0.984, respectively. These metrics are 0.896, 0.873, and 0.826 based on the DrugBank dataset. Then, we conduct the case studies on the three newly approved drugs by the Food and Drug Administration (FDA) in 2024 using the PTB-DDI framework in dual modes. The obtained results indicate that our proposed framework has advantages for predicting drug-drug interactions and that the dual modes of the framework complement each other. 
Furthermore, a free website is developed to enhance accessibility and user experience.
Affiliation(s)
- Huanxiang Liu
  - Faculty of Applied Sciences, Macao Polytechnic University, Macao SAR, China; (J.Q.); (X.Y.); (Y.T.); (Q.L.); (X.L.); (Y.Y.); (H.H.Y.T.)

2
Ahn S, Lee SE, Kim MH. Random-forest model for drug-target interaction prediction via Kullbeck-Leibler divergence. J Cheminform 2022; 14:67. [PMID: 36192818 PMCID: PMC9531514 DOI: 10.1186/s13321-022-00644-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2022] [Accepted: 09/11/2022] [Indexed: 12/04/2022]
Abstract
Virtual screening has significantly improved the success rate of early stage drug discovery. Recent virtual screening methods have improved owing to advances in machine learning and chemical information. Among these advances, the creative extraction of drug features is important for predicting drug–target interaction (DTI), which is a large-scale virtual screening of known drugs. Herein, we report Kullbeck–Leibler divergence (KLD) as a DTI feature and the feature-driven classification model applicable to DTI prediction. For the purpose, E3FP three-dimensional (3D) molecular fingerprints of drugs as a molecular representation allow the computation of 3D similarities between ligands within each target (Q–Q matrix) to identify the uniqueness of pharmacological targets and those between a query and a ligand (Q–L vector) in DTIs. The 3D similarity matrices are transformed into probability density functions via kernel density estimation as a nonparametric estimation. Each density model can exploit the characteristics of each pharmacological target and measure the quasi-distance between the ligands. Furthermore, we developed a random forest model from the KLD feature vectors to successfully predict DTIs for representative 17 targets (mean accuracy: 0.882, out-of-bag score estimate: 0.876, ROC AUC: 0.990). The method is applicable for 2D chemical similarity.
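The abstract above turns similarity values into probability densities with kernel density estimation and then compares densities with the Kullback-Leibler divergence (spelled "Kullbeck-Leibler" in the paper). A minimal sketch of that idea follows. It is not the authors' implementation: the Gaussian kernel, the fixed bandwidth, the direction KL(query || target), the grid, and all similarity values are assumptions chosen for illustration.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function built from Gaussian kernels centred on `samples`."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


def kl_divergence(p, q, grid, eps=1e-12):
    """Approximate KL(P || Q) with a Riemann sum over an evenly spaced grid."""
    step = grid[1] - grid[0]
    total = 0.0
    for x in grid:
        px = p(x) + eps  # eps keeps the logarithm finite where a density vanishes
        qx = q(x) + eps
        total += px * math.log(px / qx) * step
    return total


# Hypothetical similarity values (not taken from the paper).
target_sims = [0.62, 0.70, 0.66, 0.75, 0.58, 0.69]  # ligand-ligand similarities within one target
query_sims_close = [0.60, 0.72, 0.65, 0.71]         # query resembling the target's ligands
query_sims_far = [0.10, 0.15, 0.22, 0.18]           # query unlike the target's ligands

grid = [i / 200 for i in range(-100, 301)]          # evenly spaced points from -0.5 to 1.5
target_density = gaussian_kde(target_sims, bandwidth=0.05)

kld_close = kl_divergence(gaussian_kde(query_sims_close, 0.05), target_density, grid)
kld_far = kl_divergence(gaussian_kde(query_sims_far, 0.05), target_density, grid)
print(f"KLD (similar query):    {kld_close:.3f}")
print(f"KLD (dissimilar query): {kld_far:.3f}")
```

In this sketch a query whose similarity profile matches the target yields a small divergence and a mismatched query yields a large one, which is the property that makes the value usable as a classifier feature.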
Affiliation(s)
- Sangjin Ahn
  - Gachon Institute of Pharmaceutical Science and Department of Pharmacy, College of Pharmacy, Gachon University, 191 Hambakmoeiro, Yeonsu-gu, Incheon, Republic of Korea; Department of Artificial Intelligence, Ajou University, Suwon, 16499, Republic of Korea
- Si Eun Lee
  - Gachon Institute of Pharmaceutical Science and Department of Pharmacy, College of Pharmacy, Gachon University, 191 Hambakmoeiro, Yeonsu-gu, Incheon, Republic of Korea
- Mi-Hyun Kim
  - Gachon Institute of Pharmaceutical Science and Department of Pharmacy, College of Pharmacy, Gachon University, 191 Hambakmoeiro, Yeonsu-gu, Incheon, Republic of Korea

3
Mashabela MD, Masamba P, Kappo AP. Metabolomics and Chemoinformatics in Agricultural Biotechnology Research: Complementary Probes in Unravelling New Metabolites for Crop Improvement. BIOLOGY 2022; 11:1156. [PMID: 36009783 PMCID: PMC9405339 DOI: 10.3390/biology11081156] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 06/08/2022] [Revised: 07/16/2022] [Accepted: 07/28/2022] [Indexed: 11/25/2022]
Abstract
The United Nations (UN) estimate that the global population will reach 10 billion people by 2050. These projections have placed the agroeconomic industry under immense pressure to meet the growing demand for food and maintain global food security. However, factors associated with climate variability and the emergence of virulent plant pathogens and pests pose a considerable threat to meeting these demands. Advanced crop improvement strategies are required to circumvent the deleterious effects of biotic and abiotic stress and improve yields. Metabolomics is an emerging field in the omics pipeline and systems biology concerned with the quantitative and qualitative analysis of metabolites from a biological specimen under specified conditions. In the past few decades, metabolomics techniques have been extensively used to decipher and describe the metabolic networks associated with plant growth and development and the response and adaptation to biotic and abiotic stress. In recent years, metabolomics technologies, particularly plant metabolomics, have expanded to screening metabolic biomarkers for enhanced performance in yield and stress tolerance for metabolomics-assisted breeding. This review explores the recent advances in the application of metabolomics in agricultural biotechnology for biomarker discovery and the identification of new metabolites for crop improvement. We describe the basic plant metabolomics workflow, the essential analytical techniques, and the power of these combined analytical techniques with chemometrics and chemoinformatics tools. Furthermore, there are mentions of integrated omics systems for metabolomics-assisted breeding and of current applications.
Affiliation(s)
- Abidemi Paul Kappo
  - Department of Biochemistry, Faculty of Science, University of Johannesburg, Auckland Park Kingsway Campus, P.O. Box 524, Johannesburg 2006, South Africa; (M.D.M.); (P.M.)

4
Prediction of Blood-Brain Barrier Penetration (BBBP) Based on Molecular Descriptors of the Free-Form and In-Blood-Form Datasets. Molecules 2021; 26:7428. [PMID: 34946509 PMCID: PMC8708321 DOI: 10.3390/molecules26247428] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2021] [Revised: 11/28/2021] [Accepted: 12/03/2021] [Indexed: 11/16/2022]
Abstract
The blood-brain barrier (BBB) controls the entry of chemicals from the blood to the brain. Since brain drugs need to penetrate the BBB, rapid and reliable prediction of BBB penetration (BBBP) is helpful for drug development. In this study, free-form and in-blood-form datasets were prepared by modifying the original BBBP dataset, and the effects of the data modification were investigated. For each dataset, molecular descriptors were generated and used for BBBP prediction by machine learning (ML). For ML, the dataset was split into training, validation, and test data by the scaffold split algorithm MoleculeNet used. This creates an unbalanced split and makes the prediction difficult; however, we decided to use that algorithm to evaluate the predictive performance for unknown compounds dissimilar to existing ones. The highest prediction score was obtained by the random forest model using 212 descriptors from the free-form dataset, and this score was higher than the existing best score using the same split algorithm without using any external database. Furthermore, using a deep neural network, a comparable result was obtained with only 11 descriptors from the free-form dataset, and the resulting descriptors suggested the importance of recognizing the glucose-like characteristics in BBBP prediction.
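The scaffold split mentioned in the abstract keeps all molecules that share a scaffold in the same subset, which is why the resulting split is unbalanced and harder than a random one. The sketch below shows only the grouping and greedy assignment logic, in the spirit of the MoleculeNet procedure. It assumes the scaffold of each molecule has already been computed (for example with a cheminformatics toolkit); the scaffold labels, fractions, and function name are hypothetical.

```python
from collections import defaultdict


def scaffold_split(scaffolds, frac_train=0.8, frac_valid=0.1):
    """Split molecule indices so that no scaffold is shared between subsets.

    `scaffolds` holds one precomputed scaffold key per molecule (assumed input).
    Groups are assigned greedily, largest first: to train while it fits,
    then to validation while it fits, otherwise to test.
    """
    groups = defaultdict(list)
    for index, scaffold in enumerate(scaffolds):
        groups[scaffold].append(index)

    # Largest scaffold groups first; ties broken by first index for determinism.
    ordered = sorted(groups.values(), key=lambda g: (-len(g), g[0]))

    n = len(scaffolds)
    train_cutoff = frac_train * n
    valid_cutoff = (frac_train + frac_valid) * n

    train, valid, test = [], [], []
    for group in ordered:
        if len(train) + len(group) <= train_cutoff:
            train.extend(group)
        elif len(train) + len(valid) + len(group) <= valid_cutoff:
            valid.extend(group)
        else:
            test.extend(group)
    return train, valid, test


# Hypothetical scaffold labels for ten molecules.
scaffolds = ["benzene"] * 4 + ["pyridine"] * 3 + ["indole"] * 2 + ["furan"]
train, valid, test = scaffold_split(scaffolds, frac_train=0.6, frac_valid=0.2)
print("train:", train)
print("valid:", valid)
print("test: ", test)
```

Because whole groups move together, the realized subset sizes only approximate the requested fractions, and rare scaffolds tend to end up in the validation and test subsets.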

5
Oselusi SO, Christoffels A, Egieyeh SA. Cheminformatic Characterization of Natural Antimicrobial Products for the Development of New Lead Compounds. Molecules 2021; 26:3970. [PMID: 34209681 PMCID: PMC8271829 DOI: 10.3390/molecules26133970] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 04/10/2021] [Revised: 05/29/2021] [Accepted: 06/02/2021] [Indexed: 12/26/2022]
Abstract
The growing antimicrobial resistance (AMR) of pathogenic organisms to currently prescribed drugs has resulted in the failure to treat various infections caused by these superbugs. Therefore, to keep pace with the increasing drug resistance, there is a pressing need for novel antimicrobial agents, especially from non-conventional sources. Several natural products (NPs) have been shown to display promising in vitro activities against multidrug-resistant pathogens. Still, only a few of these compounds have been studied as prospective drug candidates. This may be due to the expensive and time-consuming process of conducting important studies on these compounds. The present review focuses on applying cheminformatics strategies to characterize, prioritize, and optimize NPs to develop new lead compounds against antimicrobial resistance pathogens. Moreover, case studies where these strategies have been used to identify potential drug candidates, including a few selected open-access tools commonly used for these studies, are briefly outlined.
Affiliation(s)
- Samson Olaitan Oselusi
  - School of Pharmacy, University of the Western Cape, Bellville, Cape Town 7535, South Africa
  - Correspondence:
- Alan Christoffels
  - South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Cape Town 7535, South Africa
- Samuel Ayodele Egieyeh
  - School of Pharmacy, University of the Western Cape, Bellville, Cape Town 7535, South Africa

6
Salerno L, Vanella L, Sorrenti V, Consoli V, Ciaffaglione V, Fallica AN, Canale V, Zajdel P, Pignatello R, Intagliata S. Novel mutual prodrug of 5-fluorouracil and heme oxygenase-1 inhibitor (5-FU/HO-1 hybrid): design and preliminary in vitro evaluation. J Enzyme Inhib Med Chem 2021; 36:1378-1386. [PMID: 34167427 PMCID: PMC8231349 DOI: 10.1080/14756366.2021.1928111] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 01/30/2023]
Abstract
In this work, the first mutual prodrug of 5-fluorouracil and heme oxygenase-1 inhibitor (5-FU/HO-1 hybrid) has been designed, synthesised, and evaluated for its in vitro chemical and enzymatic hydrolysis stability. Predicted in silico physicochemical properties of the newly synthesised hybrid (3) demonstrated a drug-like profile with suitable Absorption, Distribution, Metabolism, and Excretion (ADME) properties and low toxic liabilities. Preliminary cytotoxicity evaluation towards human prostate (DU145) and lung (A549) cancer cell lines demonstrated that 3 exerted a similar effect on cell viability to that produced by the reference drug 5-FU. Among the two tested cancer cell lines, the A549 cells were more susceptible for 3. Of note, hybrid 3 also had a significantly lower cytotoxic effect on healthy human lung epithelial cells (BEAS-2B) than 5-FU. Altogether our results served as an initial proof-of-concept to develop 5-FU/HO-1 mutual prodrugs as potential novel anticancer agents.
Affiliation(s)
- Loredana Salerno
  - Department of Drug and Health Sciences, University of Catania, Catania, Italy
- Luca Vanella
  - Department of Drug and Health Sciences, University of Catania, Catania, Italy
- Valeria Sorrenti
  - Department of Drug and Health Sciences, University of Catania, Catania, Italy
- Valeria Consoli
  - Department of Drug and Health Sciences, University of Catania, Catania, Italy
- Antonino N Fallica
  - Department of Drug and Health Sciences, University of Catania, Catania, Italy
- Vittorio Canale
  - Department of Organic Chemistry, Jagiellonian University Medical College, Kraków, Poland
- Paweł Zajdel
  - Department of Organic Chemistry, Jagiellonian University Medical College, Kraków, Poland
- Rosario Pignatello
  - Department of Drug and Health Sciences, University of Catania, Catania, Italy

7
Martinez-Mayorga K, Madariaga-Mazon A, Medina-Franco JL, Maggiora G. The impact of chemoinformatics on drug discovery in the pharmaceutical industry. Expert Opin Drug Discov 2020; 15:293-306. [PMID: 31965870 DOI: 10.1080/17460441.2020.1696307] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Indexed: 01/18/2023]
Abstract
Introduction: Even though there have been substantial advances in our understanding of biological systems, research in drug discovery is only just now beginning to utilize this type of information. The single-target paradigm, which exemplifies the reductionist approach, remains a mainstay of drug research today. A deeper view of the complexity involved in drug discovery is necessary to advance on this field. Areas covered: This perspective provides a summary of research areas where cheminformatics has played a key role in drug discovery, including of the available resources as well as a personal perspective of the challenges still faced in the field. Expert opinion: Although great strides have been made in the handling and analysis of biological and pharmacological data, more must be done to link the data to biological pathways. This is crucial if one is to understand how drugs modify disease phenotypes, although this will involve a shift from the single drug/single target paradigm that remains a mainstay of drug research. Moreover, such a shift would require an increased awareness of the role of physiology in the mechanism of drug action, which will require the introduction of new mathematical, computer, and biological methods for chemoinformaticians to be trained in.
Affiliation(s)
- José L Medina-Franco
  - Facultad de Química, Universidad Nacional Autónoma de México, Mexico City, Mexico

8
Pawar G, Madden JC, Ebbrell D, Firman JW, Cronin MTD. In Silico Toxicology Data Resources to Support Read-Across and (Q)SAR. Front Pharmacol 2019; 10:561. [PMID: 31244651 PMCID: PMC6580867 DOI: 10.3389/fphar.2019.00561] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Received: 01/21/2019] [Accepted: 05/03/2019] [Indexed: 12/14/2022]
Abstract
A plethora of databases exist online that can assist in in silico chemical or drug safety assessment. However, a systematic review and grouping of databases, based on purpose and information content, consolidated in a single source, has been lacking. To resolve this issue, this review provides a comprehensive listing of the key in silico data resources relevant to: chemical identity and properties, drug action, toxicology (including nano-material toxicity), exposure, omics, pathways, Absorption, Distribution, Metabolism and Elimination (ADME) properties, clinical trials, pharmacovigilance, patents-related databases, biological (genes, enzymes, proteins, other macromolecules etc.) databases, protein-protein interactions (PPIs), environmental exposure related, and finally databases relating to animal alternatives in support of 3Rs policies. More than nine hundred databases were identified and reviewed against criteria relating to accessibility, data coverage, interoperability or application programming interface (API), appropriate identifiers, types of in vitro, in vivo, clinical or other data recorded and suitability for modelling, read-across, or similarity searching. This review also specifically addresses the need for solutions for mapping and integration of databases into a common platform for better translatability of preclinical data to clinical data.
Affiliation(s)
- Mark T. D. Cronin
  - School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, United Kingdom

9
Koulouridi E, Valli M, Ntie-Kang F, Bolzani VDS. A primer on natural product-based virtual screening. PHYSICAL SCIENCES REVIEWS 2019. [DOI: 10.1515/psr-2018-0105] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 12/12/2022]
Abstract
Databases play an important role in various computational techniques, including virtual screening (VS) and molecular modeling in general. These collections of molecules can contain a large amount of information, making them suitable for several drug discovery applications. For example, vendor, bioactivity data or target type can be found when searching a database. The introduction of these data resources and their characteristics is used for the design of an experiment. The description of the construction of a database can also be a good advisor for the creation of a new one. There are free available databases and commercial virtual libraries of molecules. Furthermore, a computational chemist can find databases for a general purpose or a specific subset such as natural products (NPs). In this chapter, NP database resources are presented, along with some guidelines when preparing an NP database for drug discovery purposes.

10
Douguet D. Data Sets Representative of the Structures and Experimental Properties of FDA-Approved Drugs. ACS Med Chem Lett 2018. [PMID: 29541361 DOI: 10.1021/acsmedchemlett.7b00462] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Indexed: 12/25/2022]
Abstract
Presented here are several data sets that gather information collected from the labels of the FDA approved drugs: their molecular structures and those of the described active metabolites, their associated pharmacokinetics and pharmacodynamics data, and the history of their marketing authorization by the FDA. To date, 1852 chemical structures have been identified with a molecular weight less than 2000 of which 492 are or have active metabolites. To promote the sharing of data, the original web server was upgraded for browsing the database and downloading the data sets (http://chemoinfo.ipmc.cnrs.fr/edrug3d). It is believed that the multidimensional chemistry-oriented collections are an essential resource for a thorough analysis of the current drug chemical space. The data sets are envisioned as being used in a wide range of endeavors that include drug repurposing, drug design, privileged structures analyses, structure-activity relationship studies, and improving of absorption, distribution, metabolism, and elimination predictive models.
Affiliation(s)
- Dominique Douguet
  - Université Côte d’Azur, Inserm, CNRS, IPMC, 660 Route des Lucioles, 06560 Valbonne, France

11
Accelerating Group Fusion for Ligand-Based Virtual Screening on Multi-core and Many-core Platforms. JOURNAL OF INFORMATION PROCESSING SYSTEMS 2016. [DOI: 10.3745/jips.01.0012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]

12
Iwaniak A, Minkiewicz P, Darewicz M, Protasiewicz M, Mogut D. Chemometrics and cheminformatics in the analysis of biologically active peptides from food sources. J Funct Foods 2015. [DOI: 10.1016/j.jff.2015.04.038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]

13
Cao M, Fraser K, Huege J, Featonby T, Rasmussen S, Jones C. Predicting retention time in hydrophilic interaction liquid chromatography mass spectrometry and its use for peak annotation in metabolomics. Metabolomics 2014; 11:696-706. [PMID: 25972771 PMCID: PMC4419193 DOI: 10.1007/s11306-014-0727-x] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.5] [Received: 06/25/2014] [Accepted: 08/28/2014] [Indexed: 01/14/2023]
Abstract
Liquid chromatography coupled to mass spectrometry (LCMS) is widely used in metabolomics due to its sensitivity, reproducibility, speed and versatility. Metabolites are detected as peaks which are characterised by mass-over-charge ratio (m/z) and retention time (rt), and one of the most critical but also the most challenging tasks in metabolomics is to annotate the large number of peaks detected in biological samples. Accurate m/z measurements enable the prediction of molecular formulae which provide clues to the chemical identity of peaks, but often a number of metabolites have identical molecular formulae. Chromatographic behaviour, reflecting the physicochemical properties of metabolites, should also provide structural information. However, the variation in rt between analytical runs, and the complicating factors underlying the observed time shifts, make the use of such information for peak annotation a non-trivial task. To this end, we conducted Quantitative Structure-Retention Relationship (QSRR) modelling between the calculated molecular descriptors (MDs) and the experimental retention times (rts) of 93 authentic compounds analysed using hydrophilic interaction liquid chromatography (HILIC) coupled to high resolution MS. A predictive QSRR model based on Random Forests algorithm outperformed a Multiple Linear Regression based model, and achieved a high correlation between predicted rts and experimental rts (Pearson's correlation coefficient = 0.97), with mean and median absolute error of 0.52 min and 0.34 min (corresponding to 5.1 and 3.2 % error), respectively. We demonstrate that rt prediction with the precision achieved enables the systematic utilisation of rts for annotating unknown peaks detected in a metabolomics study. 
The application of the QSRR model with the strategy we outlined enhanced the peak annotation process by reducing the number of false positives resulting from database queries by matching accurate mass alone, and enriching the reference library. The predicted rts were validated using either authentic compounds or ion fragmentation patterns.
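The abstract above summarizes model quality with Pearson's correlation coefficient and with the mean and median absolute error between predicted and experimental retention times. A minimal sketch of how such figures can be computed is given below. The retention times are invented for illustration and are not data from the study, and the helper names are hypothetical.

```python
import math
from statistics import mean, median


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mx) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - my) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def absolute_errors(predicted, observed):
    """Element-wise absolute differences between predictions and observations."""
    return [abs(p - o) for p, o in zip(predicted, observed)]


# Hypothetical retention times in minutes (not taken from the paper).
experimental_rt = [2.1, 4.8, 6.3, 9.7, 12.4, 15.0]
predicted_rt = [2.4, 4.5, 6.9, 9.1, 12.6, 14.2]

errors = absolute_errors(predicted_rt, experimental_rt)
print(f"Pearson r:             {pearson_r(predicted_rt, experimental_rt):.3f}")
print(f"Mean absolute error:   {mean(errors):.2f} min")
print(f"Median absolute error: {median(errors):.2f} min")
```

Reporting the median alongside the mean, as the abstract does, shows whether a few badly predicted compounds dominate the average error.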
Affiliation(s)
- Mingshu Cao
  - AgResearch Grasslands Research Centre, Palmerston North, 4442 New Zealand
- Karl Fraser
  - AgResearch Grasslands Research Centre, Palmerston North, 4442 New Zealand
- Jan Huege
  - AgResearch Grasslands Research Centre, Palmerston North, 4442 New Zealand
- Tom Featonby
  - AgResearch Grasslands Research Centre, Palmerston North, 4442 New Zealand
- Susanne Rasmussen
  - Massey University, Institute of Agriculture and Environment, Palmerston North, New Zealand
- Chris Jones
  - AgResearch Grasslands Research Centre, Palmerston North, 4442 New Zealand

14
In silico molecular modeling and prediction of activity of substituted tetrahydropyrans as COX-2 inhibitor. Med Chem Res 2014. [DOI: 10.1007/s00044-014-1148-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/25/2022]

15
Tambunan USF, Pratomo H, Parikesit AA. Modification of Kampmann A5 as Potential Fusion Inhibitor of Dengue Virus using Molecular Docking and Molecular Dynamics Approach. JOURNAL OF MEDICAL SCIENCES 2013. [DOI: 10.3923/jms.2013.621.634] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 02/06/2023]

16
Jacobsen UP, Nielsen HB, Hildebrand F, Raes J, Sicheritz-Ponten T, Kouskoumvekaki I, Panagiotou G. The chemical interactome space between the human host and the genetically defined gut metabotypes. THE ISME JOURNAL 2013; 7:730-42. [PMID: 23178670 PMCID: PMC3603391 DOI: 10.1038/ismej.2012.141] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 02/14/2012] [Revised: 07/30/2012] [Accepted: 09/28/2012] [Indexed: 01/07/2023]
Abstract
The bacteria that colonize the gastrointestinal tracts of mammals represent a highly selected microbiome that has a profound influence on human physiology by shaping the host's metabolic and immune system activity. Despite the recent advances on the biological principles that underlie microbial symbiosis in the gut of mammals, mechanistic understanding of the contributions of the gut microbiome and how variations in the metabotypes are linked to the host health are obscure. Here, we mapped the entire metabolic potential of the gut microbiome based solely on metagenomics sequencing data derived from fecal samples of 124 Europeans (healthy, obese and with inflammatory bowel disease). Interestingly, three distinct clusters of individuals with high, medium and low metabolic potential were observed. By illustrating these results in the context of bacterial population, we concluded that the abundance of the Prevotella genera is a key factor indicating a low metabolic potential. These metagenome-based metabolic signatures were used to study the interaction networks between bacteria-specific metabolites and human proteins. We found that thirty-three such metabolites interact with disease-relevant protein complexes several of which are highly expressed in cells and tissues involved in the signaling and shaping of the adaptive immune system and associated with squamous cell carcinoma and bladder cancer. From this set of metabolites, eighteen are present in DrugBank providing evidence that we carry a natural pharmacy in our guts. Furthermore, we established connections between the systemic effects of non-antibiotic drugs and the gut microbiome of relevance to drug side effects and health-care solutions.
Affiliation(s)
- Ulrik Plesner Jacobsen
  - Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Lyngby, Denmark
- Henrik Bjørn Nielsen
  - Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Lyngby, Denmark
  - NNF-Center for Biosustainability, Technical University of Denmark, Horsholm, Denmark
- Falk Hildebrand
  - Research Group of Bioinformatics and (eco-)systems biology, Department of Structural Biology, VIB, Brussels, Belgium
  - Research Group of Bioinformatics and (eco-)systems biology, Microbiology Unit (MICR), Department of Applied Biological Sciences (DBIT), Vrije Universiteit Brussel, Brussels, Belgium
- Jeroen Raes
  - Research Group of Bioinformatics and (eco-)systems biology, Department of Structural Biology, VIB, Brussels, Belgium
  - Research Group of Bioinformatics and (eco-)systems biology, Microbiology Unit (MICR), Department of Applied Biological Sciences (DBIT), Vrije Universiteit Brussel, Brussels, Belgium
- Thomas Sicheritz-Ponten
  - Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Lyngby, Denmark
  - NNF-Center for Biosustainability, Technical University of Denmark, Horsholm, Denmark
- Irene Kouskoumvekaki
  - Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Lyngby, Denmark
- Gianni Panagiotou
  - Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Lyngby, Denmark
  - NNF-Center for Biosustainability, Technical University of Denmark, Horsholm, Denmark
  - School of Biological Sciences, The University of Hong Kong, Hong Kong, China

17
Opletalová V, Kastner P, Kučerová-Chlupáčová M, Palát K. Study of hydrophobic properties of biologically active open analogues of flavonoids. J Mol Graph Model 2013; 39:61-4. [DOI: 10.1016/j.jmgm.2012.07.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/21/2011] [Revised: 07/03/2012] [Accepted: 07/30/2012] [Indexed: 11/30/2022]

18
Hechinger M, Leonhard K, Marquardt W. What is Wrong with Quantitative Structure–Property Relations Models Based on Three-Dimensional Descriptors? J Chem Inf Model 2012; 52:1984-93. [DOI: 10.1021/ci300246m] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.7] [Indexed: 12/12/2022]
Affiliation(s)
- M. Hechinger
  - AVT-Process Systems Engineering and ‡Chair of Technical Thermodynamics, RWTH Aachen University, 52064 Aachen, Germany
- K. Leonhard
  - AVT-Process Systems Engineering and ‡Chair of Technical Thermodynamics, RWTH Aachen University, 52064 Aachen, Germany
- W. Marquardt
  - AVT-Process Systems Engineering and ‡Chair of Technical Thermodynamics, RWTH Aachen University, 52064 Aachen, Germany

19
Kumar AB, Anderson JM, Melendez AL, Manetsch R. Synthesis and structure-activity relationship studies of 1,3-disubstituted 2-propanols as BACE-1 inhibitors. Bioorg Med Chem Lett 2012; 22:4740-4. [PMID: 22727644 DOI: 10.1016/j.bmcl.2012.05.072] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 03/11/2012] [Revised: 05/16/2012] [Accepted: 05/18/2012] [Indexed: 12/11/2022]
Abstract
A library of 1,3-disubstituted 2-propanols was synthesized and evaluated as low molecular weight probes for β-secretase inhibition. By screening a library of 121 1,3-disubstituted 2-propanol derivatives, we identified a few compounds that inhibit the enzyme at low micromolar concentrations. The initial hits were optimized to yield a potent BACE-1 inhibitor exhibiting an IC50 in the nanomolar range. Exploration of the pharmacological properties revealed that these small-molecule inhibitors possessed high selectivity over cathepsin D and desirable physicochemical properties beneficial for crossing the blood-brain barrier.
Affiliation(s)
- Arun Babu Kumar
- Department of Chemistry, University of South Florida, CHE 205, 4202 E. Fowler Ave, Tampa, FL 33620, USA
|
20
|
Abstract
Computational methods now play an integral role in modern drug discovery, and include the design and management of small molecule libraries, initial hit identification through virtual screening, optimization of the affinity and selectivity of hits, and improving the physicochemical properties of the lead compounds. In this chapter, we survey the most important data sources for the discovery of new molecular entities, and discuss the key considerations and guidelines for virtual chemical library design.
Affiliation(s)
- Paul H Bernardo
- Institute of Chemical and Engineering Sciences, Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
|
21
|
Wan P, Li Q, Larsen JEP, Eklund AC, Parlesak A, Rigina O, Nielsen SJ, Björkling F, Jónsdóttir SÓ. Prediction of drug efficacy for cancer treatment based on comparative analysis of chemosensitivity and gene expression data. Bioorg Med Chem 2011; 20:167-76. [PMID: 22154557 DOI: 10.1016/j.bmc.2011.11.019] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 07/10/2011] [Revised: 11/06/2011] [Accepted: 11/11/2011] [Indexed: 01/24/2023]
Abstract
The NCI60 database is the largest available collection of compounds with measured anti-cancer activity. The strengths and limitations for using the NCI60 database as a source of new anti-cancer agents are explored and discussed in relation to previous studies. We selected a sub-set of 2333 compounds with reliable experimental half-maximal growth inhibition (GI50) values for 30 cell lines from the NCI60 data set and evaluated their growth inhibitory effect (chemosensitivity) with respect to tissue of origin. This was done by identifying natural clusters in the chemosensitivity data set and in a data set of expression profiles of 1901 genes for the corresponding tumor cell lines. Five clusters were identified based on the gene expression data using self-organizing maps (SOM), comprising leukemia, melanoma, ovarian and prostate, basal breast, and luminal breast cancer cells, respectively. The strong difference in gene expression between basal and luminal breast cancer cells was reflected clearly in the chemosensitivity data. Although most compounds in the data set were of low potency, high efficacy compounds that showed specificity with respect to tissue of origin could be found. Furthermore, eight potential topoisomerase II inhibitors were identified using a structural similarity search. Finally, a set of genes with expression profiles that were significantly correlated with anti-cancer drug activity was identified. Our study demonstrates that the combined data sets, which provide comprehensive information on drug activity and gene expression profiles of tumor cell lines studied, are useful for identifying potential new active compounds.
Affiliation(s)
- Peng Wan
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Bldg. 208, DK-2800 Kgs. Lyngby, Denmark.
|
22
|
Bak A, Magdziarz T, Kurczyk A, Polanski J. Mapping drug architecture by MoStBioDat: rapid screening of intramolecular hydrogen bonded motifs in catechols. Drug Dev Res 2010. [DOI: 10.1002/ddr.20417] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/11/2022]
|
23
|
Maqungo M, Kaur M, Kwofie SK, Radovanovic A, Schaefer U, Schmeier S, Oppon E, Christoffels A, Bajic VB. DDPC: Dragon Database of Genes associated with Prostate Cancer. Nucleic Acids Res 2010; 39:D980-5. [PMID: 20880996 PMCID: PMC3013759 DOI: 10.1093/nar/gkq849] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Indexed: 12/02/2022] Open
Abstract
Prostate cancer (PC) is one of the most commonly diagnosed cancers in men. PC is relatively difficult to diagnose due to a lack of clear early symptoms. Extensive research of PC has led to the availability of a large amount of data on PC. Several hundred genes are implicated in different stages of PC, which may help in developing diagnostic methods or even cures. In spite of this accumulated information, effective diagnostics and treatments remain evasive. We have developed Dragon Database of Genes associated with Prostate Cancer (DDPC) as an integrated knowledgebase of genes experimentally verified as implicated in PC. DDPC is distinctive from other databases in that (i) it provides pre-compiled biomedical text-mining information on PC, which otherwise require tedious computational analyses, (ii) it integrates data on molecular interactions, pathways, gene ontologies, gene regulation at molecular level, predicted transcription factor binding sites on promoters of PC implicated genes and transcription factors that correspond to these binding sites and (iii) it contains DrugBank data on drugs associated with PC. We believe this resource will serve as a source of useful information for research on PC. DDPC is freely accessible for academic and non-profit users via http://apps.sanbi.ac.za/ddpc/ and http://cbrc.kaust.edu.sa/ddpc/.
Affiliation(s)
- Monique Maqungo
- South African National Bioinformatics Institute, University of the Western Cape, Private Bag-X17, Modderdam Road, Bellville, Cape Town, South Africa
|
24
|
Katritzky AR, Kuanar M, Slavov S, Hall CD, Karelson M, Kahn I, Dobchev DA. Quantitative Correlation of Physical and Chemical Properties with Chemical Structure: Utility for Prediction. Chem Rev 2010; 110:5714-89. [DOI: 10.1021/cr900238d] [Citation(s) in RCA: 386] [Impact Index Per Article: 25.7] [Indexed: 11/28/2022]
Affiliation(s)
- Alan R. Katritzky
- Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, Gainesville, Florida 32611
- Minati Kuanar
- Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, Gainesville, Florida 32611
- Svetoslav Slavov
- Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, Gainesville, Florida 32611
- C. Dennis Hall
- Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, Gainesville, Florida 32611
- Mati Karelson
- Institute of Chemistry, Tallinn University of Technology, Akadeemia tee 15, Tallinn 19086, Estonia, and MolCode, Ltd., Soola 8, Tartu 51013, Estonia
- Iiris Kahn
- Institute of Chemistry, Tallinn University of Technology, Akadeemia tee 15, Tallinn 19086, Estonia, and MolCode, Ltd., Soola 8, Tartu 51013, Estonia
- Dimitar A. Dobchev
- Institute of Chemistry, Tallinn University of Technology, Akadeemia tee 15, Tallinn 19086, Estonia, and MolCode, Ltd., Soola 8, Tartu 51013, Estonia
|
25
|
Hazai E, Hazai I, Demko L, Kovacs S, Malik D, Akli P, Hari P, Szeman J, Fenyvesi E, Benes E, Szente L, Bikadi Z. Cyclodextrin knowledgebase a web-based service managing CD-ligand complexation data. J Comput Aided Mol Des 2010; 24:713-7. [DOI: 10.1007/s10822-010-9368-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 02/12/2010] [Accepted: 05/17/2010] [Indexed: 10/19/2022]
|
26
|
Seed M, Agius R. Further validation of computer-based prediction of chemical asthma hazard. Occup Med (Lond) 2009; 60:115-20. [PMID: 19955299 DOI: 10.1093/occmed/kqp168] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.1] [Indexed: 11/12/2022] Open
Abstract
BACKGROUND There is no agreed protocol for the prediction of low molecular weight (LMW) respiratory sensitizers. This creates challenges for occupational physicians responsible for the health of workforces using novel chemicals and respiratory physicians investigating cases of occupational asthma caused by novel asthmagens. AIMS To iterate the external validation of a previously published quantitative structure-activity relationship (QSAR) model for the prediction of novel chemical respiratory sensitizers and to better characterize its predictive accuracy. METHODS An external validation set of control chemicals was identified from the Australian Hazardous Substances Information System. An external validation set of asthmagenic chemicals was identified by a thorough search of the peer-reviewed literature from January 1995 onwards using the Medline database. The QSAR model was used to determine an 'asthma hazard index' (between 0 and 1) for each chemical. RESULTS A total of 28 external validation asthmagens and 129 control chemicals were identified. The area under the receiver operating characteristic (ROC) curve for the model's ability to distinguish asthmagens from controls was 0.87 (95% CI 0.76-0.97). Using a cut-off hazard index of 0.5 resulted in sensitivity of 79% and specificity of 93%. For prior probability ranging from 1:300 to 1:100, the negative predictive value (NPV) was 1 and positive predictive value (PPV) 0.04-0.1 while for prior probability ranging from 1:20 to 1:3, the NPV was 0.91-0.99 and PPV 0.39-0.85. CONCLUSIONS The ROC curve for this QSAR demonstrates good global predictive power for distinguishing asthmagenic from non-asthmagenic LMW organic compounds. Potential for utilization by occupational and respiratory physicians is evident from its predictive values.
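The dependence of the predictive values on prior probability reported above follows from Bayes' rule. A minimal sketch, assuming that a prior written as "1:N" is read as a prevalence of 1/N (an interpretation not stated in the abstract) and using the rounded sensitivity (0.79) and specificity (0.93) quoted there; the function name `predictive_values` is illustrative only. Because the inputs are rounded, the output should track the reported ranges only approximately.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Rounded values quoted in the abstract for a hazard-index cut-off of 0.5.
SENSITIVITY = 0.79
SPECIFICITY = 0.93

# Assumption: "1:N" is treated as prevalence 1/N.
for label, prevalence in [("1:300", 1 / 300), ("1:100", 1 / 100),
                          ("1:20", 1 / 20), ("1:3", 1 / 3)]:
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prior {label}: PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

The sketch makes the abstract's point concrete: at low prior probability almost all positives are false positives, so the model is mainly useful for ruling chemicals out, while at higher prior probability a positive result becomes informative.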
Affiliation(s)
- Martin Seed
- Occupational and Environmental Health Research Group, University of Manchester, Room C4.13, Ellen Wilkinson Building, Oxford Road, Manchester M13 9PL, UK.
|
27
|
Park J, Rosania GR, Saitou K. Tunable machine vision-based strategy for automated annotation of chemical databases. J Chem Inf Model 2009; 49:1993-2001. [PMID: 19621901 PMCID: PMC2907084 DOI: 10.1021/ci900029v] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
We present a tunable, machine vision-based strategy for automated annotation of virtual small molecule databases. The proposed strategy is based on the use of a machine vision-based tool for extracting structure diagrams in research articles and converting them into connection tables, a virtual "Chemical Expert" system for screening the converted structures based on the adjustable levels of estimated conversion accuracy, and a fragment-based measure for calculating intermolecular similarity. For annotation, calculated chemical similarity between the converted structures and entries in a virtual small molecule database is used to establish the links. The overall annotation performances can be tuned by adjusting the cutoff threshold of the estimated conversion accuracy. We perform an annotation test which attempts to link 121 journal articles registered in PubMed to entries in PubChem which is the largest, publicly accessible chemical database. Two cases of tests are performed, and their results are compared to see how the overall annotation performances are affected by the different threshold levels of the estimated accuracy of the converted structure. Our work demonstrates that over 45% of the articles could have true positive links to entries in the PubChem database with promising recall and precision rates in both tests. Furthermore, we illustrate that the Chemical Expert system which can screen converted structures based on the adjustable levels of estimated conversion accuracy is a key factor impacting the overall annotation performance. We propose that this machine vision-based strategy can be incorporated with the text-mining approach to facilitate extraction of contextual scientific knowledge about a chemical structure, from the scientific literature.
Affiliation(s)
- Jungkap Park
- Department of Mechanical Engineering, University of Michigan, Ann Arbor, Michigan 48109
- Gus R. Rosania
- Department of Pharmaceutical Sciences, University of Michigan, Ann Arbor, Michigan 48109
- Kazuhiro Saitou
- Department of Mechanical Engineering, University of Michigan, Ann Arbor, Michigan 48109
|
28
|
In silico platform for xenobiotics ADME-T pharmacological properties modeling and prediction. Part I: Beyond the reduction of animal model use. Drug Discov Today 2009; 14:401-5. [PMID: 19340929 DOI: 10.1016/j.drudis.2009.01.009] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 01/30/2023]
Abstract
There is an urgent need for efficient in silico ADME-T prediction tools for the selection of potent therapeutic drugs as well as the elimination of toxic compounds. This is particularly important in view of the high costs and ethical issues inherent to the use of animal models for drugs filtering. To achieve this mission, not only does the accuracy of in silico tools need to be improved, but also new experts in the field with skills in theoretical chemistry, clinical and fundamental biology have to be trained. Similarly, clinical biologists committed to the obligation of means and legally responsible for the results they generate could establish a legal framework that defines legal responsibilities when performing in silico predictions.
|
29
|
Southan C, Várkonyi P, Muresan S. Quantitative assessment of the expanding complementarity between public and commercial databases of bioactive compounds. J Cheminform 2009; 1:10. [PMID: 20298516 PMCID: PMC3225862 DOI: 10.1186/1758-2946-1-10] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.8] [Received: 05/10/2009] [Accepted: 07/06/2009] [Indexed: 01/30/2023] Open
Abstract
BACKGROUND Since 2004 public cheminformatic databases and their collective functionality for exploring relationships between compounds, protein sequences, literature and assay data have advanced dramatically. In parallel, commercial sources that extract and curate such relationships from journals and patents have also been expanding. This work updates a previous comparative study of databases chosen because of their bioactive content, availability of downloads and facility to select informative subsets. RESULTS Where they could be calculated, extracted compounds-per-journal article were in the range of 12 to 19 but compound-per-protein counts increased with document numbers. Chemical structure filtration to facilitate standardised comparisons typically reduced source counts by between 5% and 30%. The pair-wise overlaps between 23 databases and subsets were determined, as well as changes between 2006 and 2008. While all compound sets have increased, PubChem has doubled to 14.2 million. The 2008 comparison matrix shows not only overlap but also unique content across all sources. Many of the detailed differences could be attributed to individual strategies for data selection and extraction. While there was a big increase in patent-derived structures entering PubChem since 2006, GVKBIO contains over 0.8 million unique structures from this source. Venn diagrams showed extensive overlap between compounds extracted by independent expert curation from journals by GVKBIO, WOMBAT (both commercial) and BindingDB (public) but each included unique content. In contrast, the approved drug collections from GVKBIO, MDDR (commercial) and DrugBank (public) showed surprisingly low overlap. Aggregating all commercial sources established that while 1 million compounds overlapped with PubChem 1.2 million did not. CONCLUSION On the basis of chemical structure content per se public sources have covered an increasing proportion of commercial databases over the last two years. 
However, commercial products included in this study provide links between compounds and information from patents and journals at a larger scale than current public efforts. They also continue to capture a significant proportion of unique content. Our results thus demonstrate not only an encouraging overall expansion of data-supported bioactive chemical space but also that both commercial and public sources are complementary for its exploration.
Affiliation(s)
- Christopher Southan
- ChrisDS Consulting, S-42166, Göteborg, Sweden
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Péter Várkonyi
- DECS Global Compound Sciences, Computational Chemistry, AstraZeneca R&D Mölndal, S-43183 Mölndal, Sweden
- Sorel Muresan
- DECS Global Compound Sciences, Computational Chemistry, AstraZeneca R&D Mölndal, S-43183 Mölndal, Sweden
|
30
|
Shigemizu D, Araki M, Okuda S, Goto S, Kanehisa M. Extraction and analysis of chemical modification patterns in drug development. J Chem Inf Model 2009; 49:1122-9. [PMID: 19391632 DOI: 10.1021/ci8003804] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/29/2022]
Abstract
Most drugs have been continuously modified from prototypic compounds in the drug development process. Such chemical modifications in the history of drug development are expected to contain a wealth of medicinal chemists' knowledge, and the KEGG DRUG structure maps have been compiled to capture this knowledge. Here we attempted to extract the information on the chemical modification patterns from 3745 approved drugs in the KEGG DRUG database and 255 drug pairs in the KEGG DRUG structure maps. We first identified 236 core structures and 506 peripheral fragments from the KEGG DRUG database using bit-represented fingerprints and hierarchical clustering of similar structures. We then examined position-dependent relationships between core structures and peripheral fragments, which revealed the tendency of specific fragments connected to specific modification sites on the core structures. Next we converted the drug pairs into 204 peripheral fragment changes at the modification sites. Each change was represented by the transformation profile defined as a difference of fingerprint bit patterns, and the hierarchical clustering of similar transformation profiles was performed. We thus identified 125 chemical modification patterns that characterize the KEGG DRUG structure maps. These patterns were further applied to the reconstruction of a new structure map. The approach presented here may be applicable to systematic in silico drug modifications.
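The "transformation profile defined as a difference of fingerprint bit patterns" can be illustrated with a toy example. This is only a sketch under stated assumptions: the 8-bit fingerprints, the signed encoding (+1 for a bit gained, -1 for a bit lost), and the helper names `transformation_profile` and `profile_distance` are hypothetical and are not taken from the paper, which does not specify its fingerprint or distance measure in the abstract.

```python
def transformation_profile(fp_before, fp_after):
    """Signed bitwise difference between two equal-length bit fingerprints.

    +1 marks a bit switched on by the modification, -1 a bit switched off,
    and 0 an unchanged position. (Hypothetical encoding for illustration.)
    """
    if len(fp_before) != len(fp_after):
        raise ValueError("fingerprints must have the same length")
    return tuple(after - before for before, after in zip(fp_before, fp_after))


def profile_distance(profile_a, profile_b):
    """Count positions where two profiles differ (a Hamming-style distance)."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same length")
    return sum(a != b for a, b in zip(profile_a, profile_b))


# Toy 8-bit fingerprints for two hypothetical drug pairs (prototype, derivative).
pair_1 = ((1, 0, 1, 0, 0, 1, 0, 0), (1, 1, 0, 0, 0, 1, 0, 0))
pair_2 = ((0, 0, 1, 1, 0, 0, 0, 1), (0, 1, 0, 1, 0, 0, 0, 1))

profile_1 = transformation_profile(*pair_1)
profile_2 = transformation_profile(*pair_2)

# Both pairs gain bit 1 and lose bit 2, so their profiles coincide even
# though the underlying molecules differ.
print(profile_1, profile_2, profile_distance(profile_1, profile_2))
```

Profiles of this kind depend only on what changed, not on the shared scaffold, which is why similar modifications on different core structures can end up in the same cluster.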
Affiliation(s)
- Daichi Shigemizu
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan
|
31
|
Domańska U, Pobudkowska A, Pelczarska A, Gierycz P. pKa and Solubility of Drugs in Water, Ethanol, and 1-Octanol. J Phys Chem B 2009; 113:8941-7. [DOI: 10.1021/jp900468w] [Citation(s) in RCA: 77] [Impact Index Per Article: 4.8] [Indexed: 11/30/2022]
Affiliation(s)
- Urszula Domańska
- Faculty of Chemistry, Physical Chemistry Division, Warsaw University of Technology, Noakowskiego 3, 00-664 Warsaw, Poland, and Institute of Physical Chemistry, Polish Academy of Science, Kasprzaka 44/52, 01-224 Warsaw, Poland
- Aneta Pobudkowska
- Faculty of Chemistry, Physical Chemistry Division, Warsaw University of Technology, Noakowskiego 3, 00-664 Warsaw, Poland, and Institute of Physical Chemistry, Polish Academy of Science, Kasprzaka 44/52, 01-224 Warsaw, Poland
- Aleksandra Pelczarska
- Faculty of Chemistry, Physical Chemistry Division, Warsaw University of Technology, Noakowskiego 3, 00-664 Warsaw, Poland, and Institute of Physical Chemistry, Polish Academy of Science, Kasprzaka 44/52, 01-224 Warsaw, Poland
- Paweł Gierycz
- Faculty of Chemistry, Physical Chemistry Division, Warsaw University of Technology, Noakowskiego 3, 00-664 Warsaw, Poland, and Institute of Physical Chemistry, Polish Academy of Science, Kasprzaka 44/52, 01-224 Warsaw, Poland
|
32
|
Song CM, Lim SJ, Tong JC. Recent advances in computer-aided drug design. Brief Bioinform 2009; 10:579-91. [PMID: 19433475 DOI: 10.1093/bib/bbp023] [Citation(s) in RCA: 175] [Impact Index Per Article: 10.9] [Indexed: 01/28/2023] Open
Abstract
Modern drug discovery is characterized by the production of vast quantities of compounds and the need to examine these huge libraries in short periods of time. The need to store, manage and analyze these rapidly increasing resources has given rise to the field known as computer-aided drug design (CADD). CADD represents computational methods and resources that are used to facilitate the design and discovery of new therapeutic solutions. Digital repositories, containing detailed information on drugs and other useful compounds, are goldmines for the study of chemical reactions capabilities. Design libraries, with the potential to generate molecular variants in their entirety, allow the selection and sampling of chemical compounds with diverse characteristics. Fold recognition, for studying sequence-structure homology between protein sequences and structures, are helpful for inferring binding sites and molecular functions. Virtual screening, the in silico analog of high-throughput screening, offers great promise for systematic evaluation of huge chemical libraries to identify potential lead candidates that can be synthesized and tested. In this article, we present an overview of the most important data sources and computational methods for the discovery of new molecular entities. The workflow of the entire virtual screening campaign is discussed, from data collection through to post-screening analysis.
Affiliation(s)
- Chun Meng Song
- Institute for Infocomm Research, Connexis South Tower, Singapore 138632
|
33
|
Sperandio O, Petitjean M, Tuffery P. wwLigCSRre: a 3D ligand-based server for hit identification and optimization. Nucleic Acids Res 2009; 37:W504-9. [PMID: 19429687 PMCID: PMC2703967 DOI: 10.1093/nar/gkp324] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Indexed: 11/20/2022] Open
Abstract
The wwLigCSRre web server performs ligand-based screening using a 3D molecular similarity engine. Its aim is to provide an online versatile facility to assist the exploration of the chemical similarity of families of compounds, or to propose some scaffold hopping from a query compound. The service allows the user to screen several chemically diversified focused banks, such as Kinase-, CNS-, GPCR-, Ion-channel-, Antibacterial-, Anticancer- and Analgesic-focused libraries. The server also provides the possibility to screen the DrugBank and DSSTOX/Carcinogenic compounds databases. User banks can also be downloaded. The 3D similarity search combines both geometrical (3D) and physicochemical information. Starting from one 3D ligand molecule as query, the screening of such databases can lead to unraveled compound scaffold as hits or help to optimize previously identified hit molecules in a SAR (Structure activity relationship) project. wwLigCSRre can be accessed at http://bioserv.rpbs.univ-paris-diderot.fr/wwLigCSRre.html.
Affiliation(s)
- O Sperandio
- MTi, INSERM UMR-S973, Université Paris Diderot - Paris 7, F75013, Paris, France
|
34
|
Dobson PD, Patel Y, Kell DB. ‘Metabolite-likeness’ as a criterion in the design and selection of pharmaceutical drug libraries. Drug Discov Today 2009; 14:31-40. [DOI: 10.1016/j.drudis.2008.10.011] [Citation(s) in RCA: 101] [Impact Index Per Article: 6.3] [Received: 07/02/2008] [Revised: 10/14/2008] [Accepted: 10/21/2008] [Indexed: 10/21/2022]
|
35
|
Vert JP, Jacob L. Machine learning for in silico virtual screening and chemical genomics: new strategies. Comb Chem High Throughput Screen 2008; 11:677-85. [PMID: 18795887 PMCID: PMC2748698 DOI: 10.2174/138620708785739899] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.0] [Indexed: 01/31/2023]
Abstract
Support vector machines and kernel methods belong to the same class of machine learning algorithms that has recently become prominent in both computational biology and chemistry, although both fields have largely ignored each other. These methods are based on a sound mathematical and computationally efficient framework that implicitly embeds the data of interest, respectively proteins and small molecules, in high-dimensional feature spaces where various classification or regression tasks can be performed with linear algorithms. In this review, we present the main ideas underlying these approaches, survey how both the “biological” and the “chemical” spaces have been separately constructed using the same mathematical framework and tricks, and suggest different avenues to unify both spaces for the purpose of in silico chemogenomics.
Affiliation(s)
- Jean-Philippe Vert
- Centre for Computational Biology, Mines ParisTech, 35 rue, Saint-Honoré, France.
|
36
|
Moda TL, Torres LG, Carrara AE, Andricopulo AD. PK/DB: database for pharmacokinetic properties and predictive in silico ADME models. Bioinformatics 2008; 24:2270-1. [PMID: 18684738 DOI: 10.1093/bioinformatics/btn415] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.1] [Indexed: 11/13/2022] Open
Abstract
The study of pharmacokinetic properties (PK) is of great importance in drug discovery and development. In the present work, PK/DB (a new freely available database for PK) was designed with the aim of creating robust databases for pharmacokinetic studies and in silico absorption, distribution, metabolism and excretion (ADME) prediction. Comprehensive, web-based and easy to access, PK/DB manages 1203 compounds which represent 2973 pharmacokinetic measurements, including five models for in silico ADME prediction (human intestinal absorption, human oral bioavailability, plasma protein binding, blood-brain barrier and water solubility). AVAILABILITY http://www.pkdb.ifsc.usp.br
Affiliation(s)
- Tiago L Moda
- Laboratory of Computational and Medicinal Chemistry, Center for Structural Molecular Biotechnology, Institute of Physics of São Carlos, University of São Paulo, São Carlos-SP 13566-970, Brazil
|
37
|
Wishart DS. Introduction to cheminformatics. Curr Protoc Bioinformatics 2008; Chapter 14:Unit 14.1. [PMID: 18428788 DOI: 10.1002/0471250953.bi1401s18] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/12/2022]
Abstract
Cheminformatics is a relatively new field of information technology that focuses on the collection, storage, analysis, and manipulation of chemical data. The chemical data of interest typically includes information on small molecule formulas, structures, properties, spectra, and activities (biological or industrial). Cheminformatics originally emerged as a vehicle to help the drug discovery and development process, however cheminformatics now plays an increasingly important role in many areas of biology, chemistry, and biochemistry. The intent of this unit is to give readers some introduction into the field of cheminformatics and to show how cheminformatics not only shares many similarities with the field of bioinformatics, but that it can also enhance much of what is currently done in bioinformatics.
|
38
|
Sauton N, Lagorce D, Villoutreix BO, Miteva MA. MS-DOCK: accurate multiple conformation generator and rigid docking protocol for multi-step virtual ligand screening. BMC Bioinformatics 2008; 9:184. [PMID: 18402678 PMCID: PMC2373571 DOI: 10.1186/1471-2105-9-184] [Citation(s) in RCA: 78] [Impact Index Per Article: 4.6] [Received: 11/19/2007] [Accepted: 04/10/2008] [Indexed: 11/21/2022] Open
Abstract
Background The number of protein targets with a known or predicted tri-dimensional structure and of drug-like chemical compounds is growing rapidly and so is the need for new therapeutic compounds or chemical probes. Performing flexible structure-based virtual screening computations on thousands of targets with millions of molecules is intractable to most laboratories nor indeed desirable. Since shape complementarity is of primary importance for most protein-ligand interactions, we have developed a tool/protocol based on rigid-body docking to select compounds that fit well into binding sites. Results Here we present an efficient multiple conformation rigid-body docking approach, MS-DOCK, which is based on the program DOCK. This approach can be used as the first step of a multi-stage docking/scoring protocol. First, we developed and validated the Multiconf-DOCK tool that generates several conformers per input ligand. Then, each generated conformer (bioactives and 37970 decoys) was docked rigidly using DOCK6 with our optimized protocol into seven different receptor-binding sites. MS-DOCK was able to significantly reduce the size of the initial input library for all seven targets, thereby facilitating subsequent more CPU demanding flexible docking procedures. Conclusion MS-DOCK can be easily used for the generation of multi-conformer libraries and for shape-based filtering within a multi-step structure-based screening protocol in order to shorten computation times.
Affiliation(s)
- Nicolas Sauton
- INSERM, U648, 45 rue des Sts Peres, University Paris Descartes, 75006 Paris, France.
|
39
|
Ioakimidis L, Thoukydidis L, Mirza A, Naeem S, Reynisson J. Benchmarking the Reliability of QikProp. Correlation between Experimental and Predicted Values. QSAR Comb Sci 2008. [DOI: 10.1002/qsar.200730051] [Citation(s) in RCA: 163] [Impact Index Per Article: 9.6] [Indexed: 12/27/2022]
|
40
|
Senger S, Leach AR. SAR Knowledge Bases in Drug Discovery. Annu Rep Comput Chem 2008. [DOI: 10.1016/s1574-1400(08)00011-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/07/2023]
|
41
|
Moda TL, Montanari CA, Andricopulo AD. Hologram QSAR model for the prediction of human oral bioavailability. Bioorg Med Chem 2007; 15:7738-45. [PMID: 17870541 DOI: 10.1016/j.bmc.2007.08.060] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.9] [Received: 06/02/2007] [Revised: 08/22/2007] [Accepted: 08/28/2007] [Indexed: 11/20/2022]
Abstract
A drug intended for use in humans should have an ideal balance of pharmacokinetics and safety, as well as potency and selectivity. Unfavorable pharmacokinetics can negatively affect the clinical development of many otherwise promising drug candidates. A variety of in silico ADME (absorption, distribution, metabolism, and excretion) models are receiving increased attention due to a better appreciation that pharmacokinetic properties should be considered in early phases of the drug discovery process. Human oral bioavailability is an important pharmacokinetic property, which is directly related to the amount of drug available in the systemic circulation to exert pharmacological and therapeutic effects. In the present work, hologram quantitative structure-activity relationship (HQSAR) analyses were performed on a training set of 250 structurally diverse molecules with known human oral bioavailability. The most significant HQSAR model (q² = 0.70, r² = 0.93) was obtained using atoms, bonds, connections, and chirality as fragment distinctions. The predictive ability of the model was evaluated with an external test set containing 52 molecules not included in the training set, and the predicted values were in good agreement with the experimental values. The HQSAR model should be useful for the design of new drug candidates with increased bioavailability, as well as in chemical library design, virtual screening, and high-throughput screening.
Affiliation(s)
- Tiago L Moda
- Laboratório de Química Medicinal e Computacional, Centro de Biotecnologia Molecular Estrutural, Instituto de Física de São Carlos, Universidade de São Paulo, 13566-970 São Carlos, SP, Brazil
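
The abstract above reports a cross-validated q² alongside a fitted r². As a rough illustration of how these two statistics differ, the sketch below computes both under their common definitions (r² from the residuals of a fit to all data, q² from leave-one-out prediction errors). It is an assumption-laden stand-in, not the authors' procedure: HQSAR uses partial least squares on fragment holograms, whereas this example uses a one-descriptor least-squares line, and the data values and function names are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def total_sum_of_squares(ys):
    """Sum of squared deviations of ys from their mean."""
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)


def r_squared(xs, ys):
    """Fitted r^2: 1 - SS_res / SS_tot, with the model fit on all data."""
    slope, intercept = fit_line(xs, ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_res / total_sum_of_squares(ys)


def q_squared_loo(xs, ys):
    """Leave-one-out q^2: 1 - PRESS / SS_tot.

    Each point is predicted by a model fit without that point.
    """
    press = 0.0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return 1.0 - press / total_sum_of_squares(ys)


# Hypothetical descriptor values and responses, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
response = [1.2, 1.9, 3.2, 3.8, 5.1, 5.9]

r2 = r_squared(descriptor, response)
q2 = q_squared_loo(descriptor, response)
print(f"r2 = {r2:.3f}, q2 = {q2:.3f}")
```

Because each left-out point is predicted by a model that never saw it, q² cannot exceed r² for a least-squares fit, which is consistent with the gap between the two values quoted in the abstract.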
|
42
|
Wallqvist A, Huang R, Covell DG. Chemoinformatic analysis of NCI preclinical tumor data: evaluating compound efficacy from mouse xenograft data, NCI-60 screening data, and compound descriptors. J Chem Inf Model 2007; 47:1414-27. [PMID: 17555311 DOI: 10.1021/ci700132u] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
Abstract
We provide a chemoinformatic examination of the NCI public human tumor xenograft data to explore relationships between small molecules, treatment modality, efficacy, and toxicity. Efficacy endpoints of tumor weight reduction (TW) and survival time increase (ST) compared to tumor-bearing control mice were augmented by a toxicity measure, defined as the survival advantage of treated versus control animals (TX). These endpoints were used to define two independent therapeutic indices (TIs) as the ratio of efficacy (TW or ST) to toxicity (TX). Linear models predictive of xenograft endpoints were successfully constructed (0.67 < r² ≤ 0.74, observed versus predicted) using a model composed of variables in treatment modality, chemoinformatic descriptors, and in vitro cell growth inhibition in the NCI 60-cell assay. Cross-validation analysis based on randomly chosen training subsets found these predictive correlations to be robust. Model-based sensitivity analysis found chemistry and growth inhibition to provide the best, and treatment modality the worst, indicators of xenograft endpoint. The poor predictive power derived from treatment alone appears to be of less importance to xenograft outcome for compounds having strongly similar chemical and biological features. ROC-based model validation found a 70% positive predictive value for distinguishing FDA-approved oncology agents from available xenograft-tested compounds. Additional chemoinformatic applications are provided that relate xenograft outcome to biological pathways and putative mechanism of compound action. These results find a strong relationship between xenograft efficacy and pathways composed of genes having highly correlated mRNA expressions. Our analysis demonstrates that chemoinformatic studies utilizing a combination of xenograft data and in vitro preclinical testing offer an effective means to identify compound classes with superior efficacy and reduced toxicity.
Affiliation(s)
- Anders Wallqvist
- Laboratory of Computational Technologies, SAIC-Frederick, Inc., NCI-Frederick, Frederick, Maryland 21702, USA.
|
43
|
Chen Y, Monshouwer M, Fitch WL. Analytical Tools and Approaches for Metabolite Identification in Early Drug Discovery. Pharm Res 2006; 24:248-57. [PMID: 17048114 DOI: 10.1007/s11095-006-9162-7] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.9] [Received: 02/03/2006] [Accepted: 06/07/2006] [Indexed: 11/30/2022]
Abstract
Determination of the chemical structures of metabolites is a critical part of the early pharmaceutical discovery process. Understanding the structures of metabolites is useful both for optimizing the metabolic stability of a drug and for rationalizing the drug safety profile. This review describes the current state of the art in this endeavor. The likely outcome of metabolism is first predicted by comparison to the literature. Then metabolites are synthesized in a variety of in vitro systems. The various approaches to LC/UV/MS are applied to obtain information about these metabolites, and structure hypotheses are made. Structures are confirmed by synthesis or NMR. The special topic of reactive metabolite structure determination is briefly addressed.
Affiliation(s)
- Yuan Chen
- Drug Metabolism and Pharmacokinetics, Roche Palo Alto, 3431 Hillview Ave., Palo Alto, California 94304, USA
|
44
|
Lu Y, Freeland S. Testing the potential for computational chemistry to quantify biophysical properties of the non-proteinaceous amino acids. Astrobiology 2006; 6:606-24. [PMID: 16916286 DOI: 10.1089/ast.2006.6.606] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 05/11/2023]
Abstract
Although most proteins of most living organisms are constructed from the same set of 20 amino acids, all indications are that this standard alphabet represents a mere subset of what was available to life during early evolution. However, we currently lack an appropriate quantitative framework with which to test the qualitative hypotheses that have been offered to date as explanations for nature's "choices." Specifically, although many indices have been developed to describe the 20 standard amino acids, few or no comparable data extend to prebiotically plausible alternatives because of the costly and time-consuming bench experiments that would be required. Computational chemistry (specifically, quantitative structure-property relationship methods) offers a potentially fast, cost-effective remedy for this knowledge gap by predicting such molecular properties in silico. Thus, we investigated the use of various freely accessible programs to predict three key amino acid properties (hydrophobicity, charge, and size). We assessed the accuracy of these predictions by comparison with experimentally determined counterparts for appropriate test data sets. In light of these results, and of factors of software accessibility and transparency, we suggest a method for further computational assessments of prebiotically plausible amino acids. The results serve as a starting point for future quantitative analysis of amino acid alphabet evolution.
Affiliation(s)
- Yi Lu
- Department of Biological Sciences, University of Maryland, Baltimore County, Baltimore, Maryland 21250, USA
|
45
|
La Clair JJ. Cellular routines in the synthesis of cyclic peptide probes. Tetrahedron 2006. [DOI: 10.1016/j.tet.2006.01.113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
|
46
|
Refsgaard HHF, Jensen BF, Christensen IT, Hagen N, Brockhoff PB. In silico prediction of cytochrome P450 inhibitors. Drug Dev Res 2006. [DOI: 10.1002/ddr.20108] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Indexed: 01/20/2023]
|
47
|
Abstract
The development of on-line software tools is changing the way we traditionally perform our analysis in drug design, but will chemoinformatics be forever behind bioinformatics in this development?
Affiliation(s)
- Igor V Tetko
- Institute for Bioinformatics, GSF, Forschungszentrum fuer Umwelt und Gesundheit, Ingolstaedter Landstrasse 1, D-85764 Neuherberg, Germany.
|
48
|
Abstract
Increased availability of large repositories of chemical compounds is creating new challenges and opportunities for the application of machine learning methods to problems in computational chemistry and chemical informatics. Because chemical compounds are often represented by the graph of their covalent bonds, machine learning methods in this domain must be capable of processing graphical structures with variable size. Here, we first briefly review the literature on graph kernels and then introduce three new kernels (Tanimoto, MinMax, Hybrid) based on the idea of molecular fingerprints and counting labeled paths of depth up to d using depth-first search from each possible vertex. The kernels are applied to three classification problems to predict mutagenicity, toxicity, and anti-cancer activity on three publicly available data sets. The kernels achieve performances at least comparable, and most often superior, to those previously reported in the literature reaching accuracies of 91.5% on the Mutag dataset, 65-67% on the PTC (Predictive Toxicology Challenge) dataset, and 72% on the NCI (National Cancer Institute) dataset. Properties and tradeoffs of these kernels, as well as other proposed kernels that leverage 1D or 3D representations of molecules, are briefly discussed.
Affiliation(s)
- Liva Ralaivola
- School of Information and Computer Sciences, University of California, Irvine, CA 92697-3425, USA
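
The abstract above describes kernels built from labeled paths of depth up to d, enumerated by depth-first search from every vertex. A minimal sketch of that idea follows, assuming the usual forms of the two simpler kernels: Tanimoto as the ratio of shared to total distinct paths, and MinMax as the ratio of summed minimum to summed maximum path counts. The graph encoding, function names, and example molecules are hypothetical choices for illustration; the sketch does not reproduce the authors' implementation or the Hybrid kernel.

```python
from collections import Counter


def labeled_paths(atoms, bonds, max_depth):
    """Count labeled simple paths with at most max_depth bonds.

    atoms: list of atom labels, indexed by vertex.
    bonds: list of (vertex_a, vertex_b, bond_label) tuples.
    Paths are enumerated by DFS from every vertex, so each path
    is counted once per direction in which it can be read.
    """
    adjacency = {i: [] for i in range(len(atoms))}
    for a, b, bond_label in bonds:
        adjacency[a].append((b, bond_label))
        adjacency[b].append((a, bond_label))

    counts = Counter()

    def dfs(node, visited, label, depth):
        counts[label] += 1
        if depth == max_depth:
            return
        for neighbor, bond_label in adjacency[node]:
            if neighbor not in visited:
                dfs(neighbor, visited | {neighbor},
                    label + bond_label + atoms[neighbor], depth + 1)

    for start in range(len(atoms)):
        dfs(start, {start}, atoms[start], 0)
    return counts


def tanimoto_kernel(paths_a, paths_b):
    """Shared distinct paths divided by all distinct paths (binary features)."""
    set_a, set_b = set(paths_a), set(paths_b)
    union = len(set_a | set_b)
    return len(set_a & set_b) / union if union else 0.0


def minmax_kernel(paths_a, paths_b):
    """Sum of per-path minimum counts divided by sum of maximum counts."""
    keys = set(paths_a) | set(paths_b)
    numerator = sum(min(paths_a[k], paths_b[k]) for k in keys)
    denominator = sum(max(paths_a[k], paths_b[k]) for k in keys)
    return numerator / denominator if denominator else 0.0


# Hypothetical heavy-atom graphs: ethanol (C-C-O) and ethylamine (C-C-N).
ethanol = labeled_paths(["C", "C", "O"], [(0, 1, "-"), (1, 2, "-")], max_depth=2)
ethylamine = labeled_paths(["C", "C", "N"], [(0, 1, "-"), (1, 2, "-")], max_depth=2)

print(tanimoto_kernel(ethanol, ethylamine))
print(minmax_kernel(ethanol, ethylamine))
```

The Tanimoto form ignores how often a path occurs, while MinMax uses the counts, so the two can rank the same pair of molecules differently when a substructure repeats.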
|
49
|
Abstract
The use of genomics to improve molecular strategies in safety assessment has immense promise, with increased mechanistic understanding and improved prediction of unknown compounds possible. Several public initiatives in toxicogenomics are now underway, and mechanistic findings are clearly emerging. A number of databases and standards are emerging to support these initiatives. Significant attention to standardization, both for biologic and technical issues, will be necessary for effective community database(s) to be fully operational.
Affiliation(s)
- A Hugh Salter
- Department of Molecular Sciences, AstraZeneca R&D, Södertälje, S-151 87 Södertälje, Sweden.
|
50
|
Chen J, Swamidass SJ, Dou Y, Bruand J, Baldi P. ChemDB: a public database of small molecules and related chemoinformatics resources. Bioinformatics 2005; 21:4133-9. [PMID: 16174682 DOI: 10.1093/bioinformatics/bti683] [Citation(s) in RCA: 94] [Impact Index Per Article: 4.7] [Indexed: 11/13/2022]
Abstract
MOTIVATION: The development of chemoinformatics has been hampered by the lack of large, publicly available, comprehensive repositories of molecules, in particular of small molecules. Small molecules play a fundamental role in organic chemistry and biology. They can be used as combinatorial building blocks for chemical synthesis, as molecular probes in chemical genomics and systems biology, and for the screening and discovery of new drugs and other useful compounds. RESULTS: We describe ChemDB, a public database of small molecules available on the Web. ChemDB is built using the digital catalogs of over a hundred vendors and other public sources and is annotated with information derived from these sources as well as from computational methods, such as predicted solubility and three-dimensional structure. It supports multiple molecular formats and is periodically updated, automatically whenever possible. The current version of the database contains approximately 4.1 million commercially available compounds, or 8.2 million counting isomers. The database includes a user-friendly graphical interface, chemical reaction capabilities, and unique search capabilities. AVAILABILITY: Database and datasets are available at http://cdb.ics.uci.edu.
Affiliation(s)
- Jonathan Chen
- Institute for Genomics and Bioinformatics, University of California, Irvine, USA
|