51
Thafar M, Raies AB, Albaradei S, Essack M, Bajic VB. Comparison Study of Computational Prediction Tools for Drug-Target Binding Affinities. Front Chem 2019; 7:782. [PMID: 31824921 PMCID: PMC6879652 DOI: 10.3389/fchem.2019.00782] [Citation(s) in RCA: 75] [Impact Index Per Article: 12.5] [Received: 08/09/2019] [Accepted: 10/30/2019] [Indexed: 12/30/2022] [Open Access]
Abstract
Drug development is generally arduous and costly, and success rates are low. Thus, the identification of drug-target interactions (DTIs) has become a crucial step in the early stages of drug discovery. Consequently, the development of computational approaches capable of identifying potential DTIs with a minimal error rate is increasingly being pursued. These computational approaches aim to narrow down the search space for novel DTIs and shed light on the context in which drugs function. Most methods developed to date use binary classification to predict whether an interaction between a drug and its target exists. However, it is more informative, but also more challenging, to predict the strength of the binding between a drug and its target. If that binding is not sufficiently strong, the DTI may not be useful. Therefore, methods developed to predict drug-target binding affinities (DTBA) are of great value. In this study, we provide a comprehensive overview of the existing methods that predict DTBA. We focus on the methods developed using artificial intelligence (AI), machine learning (ML), and deep learning (DL) approaches, as well as related benchmark datasets and databases. Furthermore, we provide guidance and recommendations that cover the gaps and directions for upcoming work in this research area. To the best of our knowledge, this is the first comprehensive comparison analysis of tools focused on DTBA with reference to AI/ML/DL.
Affiliation(s)
- Maha Thafar
  - Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
  - College of Computers and Information Technology, Taif University, Taif, Saudi Arabia
- Arwa Bin Raies
  - Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Somayah Albaradei
  - Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
  - Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
- Magbubah Essack
  - Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Vladimir B. Bajic
  - Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
52
Vamathevan J, Clark D, Czodrowski P, Dunham I, Ferran E, Lee G, Li B, Madabhushi A, Shah P, Spitzer M, Zhao S. Applications of machine learning in drug discovery and development. Nat Rev Drug Discov 2019; 18:463-477. [PMID: 30976107 DOI: 10.1038/s41573-019-0024-5] [Citation(s) in RCA: 1170] [Impact Index Per Article: 195.0] [Indexed: 02/07/2023]
Abstract
Drug discovery and development pipelines are long, complex and depend on numerous factors. Machine learning (ML) approaches provide a set of tools that can improve discovery and decision making for well-specified questions with abundant, high-quality data. Opportunities to apply ML occur in all stages of drug discovery. Examples include target validation, identification of prognostic biomarkers and analysis of digital pathology data in clinical trials. Applications have ranged in context and methodology, with some approaches yielding accurate predictions and insights. The challenges of applying ML lie primarily with the lack of interpretability and repeatability of ML-generated results, which may limit their application. In all areas, systematic and comprehensive high-dimensional data still need to be generated. With ongoing efforts to tackle these issues, as well as increasing awareness of the factors needed to validate ML approaches, the application of ML can promote data-driven decision making and has the potential to speed up the process and reduce failure rates in drug discovery and development.
Affiliation(s)
- Jessica Vamathevan
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
- Dominic Clark
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
- Ian Dunham
  - Open Targets and European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
- Edgardo Ferran
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
- George Lee
  - Bristol-Myers Squibb, Princeton, NJ, USA
- Bin Li
  - Takeda Pharmaceuticals International Co., Cambridge, MA, USA
- Anant Madabhushi
  - Case Western Reserve University, Cleveland, OH, USA
  - Louis Stokes Cleveland Veterans Affairs Medical Center, Cleveland, OH, USA
- Michaela Spitzer
  - Open Targets and European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
- Shanrong Zhao
  - Pfizer Worldwide Research and Development, Cambridge, MA, USA
53
Liu B, He H, Luo H, Zhang T, Jiang J. Artificial intelligence and big data facilitated targeted drug discovery. Stroke Vasc Neurol 2019; 4:206-213. [PMID: 32030204 PMCID: PMC6979871 DOI: 10.1136/svn-2019-000290] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 10/16/2019] [Accepted: 10/28/2019] [Indexed: 12/20/2022] [Open Access]
Abstract
The many biological databases now publicly available provide a goldmine of multidisciplinary big data. The Cancer Genome Atlas is a cancer database containing detailed information on many patients with cancer. DrugBank is a database containing detailed information on approved, investigational and withdrawn drugs, as well as other nutraceutical and metabolite structures. PubChem is a chemical compound database including all commercially available compounds as well as other synthesisable compounds. Protein Data Bank is a crystal structure database including X-ray, cryo-EM and nuclear magnetic resonance three-dimensional protein structures as well as their ligands. Meanwhile, artificial intelligence (AI) is playing an important role in the drug discovery process. The integration of such big data with AI is making a great difference in the discovery of novel targeted drugs. In this review, we focus on the currently available advanced methods for the discovery of highly effective lead compounds with favourable absorption, distribution, metabolism, excretion and toxicity properties.
Affiliation(s)
- Benquan Liu
  - Jiangsu Key Lab of Drug Screening, China Pharmaceutical University, Nanjing, China
- Huiqin He
  - Jiangsu Key Lab of Drug Screening, China Pharmaceutical University, Nanjing, China
- Hongyi Luo
  - Jiangsu Key Lab of Drug Screening, China Pharmaceutical University, Nanjing, China
- Tingting Zhang
  - Jiangsu Key Lab of Drug Screening, China Pharmaceutical University, Nanjing, China
- Jingwei Jiang
  - Jiangsu Key Lab of Drug Screening, China Pharmaceutical University, Nanjing, China
54
Koromina M, Pandi MT, Patrinos GP. Rethinking Drug Repositioning and Development with Artificial Intelligence, Machine Learning, and Omics. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2019; 23:539-548. [PMID: 31651216 DOI: 10.1089/omi.2019.0151] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Indexed: 01/01/2023]
Abstract
Pharmaceutical industry and the art and science of drug development are sorely in need of novel transformative technologies in the current age of digital health and artificial intelligence (AI). Often described as game-changing technologies, AI and machine learning algorithms have slowly but surely begun to revolutionize pharmaceutical industry and drug development over the past 5 years. In this expert review, we describe the most frequently used machine learning algorithms in drug development pipelines and the -omics databases well poised to support machine learning and drug discovery. Subsequently, we analyze the emerging new computational approaches to drug discovery and the in silico pipelines for drug repositioning and the synergies among -omics system sciences, AI and machine learning. As with system sciences, AI and machine learning embody a system scale and Big Data driven vision for drug discovery and development. We conclude with a future outlook on the ways in which machine learning approaches can be implemented to buttress and expedite drug discovery and precision medicine. As AI and machine learning are rapidly entering pharmaceutical industry and the art and science of drug development, we need to critically examine the attendant prospects and challenges to benefit patients and public health.
Affiliation(s)
- Maria Koromina
  - Laboratory of Pharmacogenomics and Individualized Therapy, Department of Pharmacy, School of Health Sciences, University of Patras, Patras, Greece
- Maria-Theodora Pandi
  - Laboratory of Pharmacogenomics and Individualized Therapy, Department of Pharmacy, School of Health Sciences, University of Patras, Patras, Greece
- George P Patrinos
  - Laboratory of Pharmacogenomics and Individualized Therapy, Department of Pharmacy, School of Health Sciences, University of Patras, Patras, Greece
  - Department of Pathology, College of Medicine and Health Sciences, United Arab Emirates University, Al-Ain, Abu Dhabi
  - Zayed Center of Health Sciences, United Arab Emirates University, Al-Ain, Abu Dhabi
55
Tong Z, Zhou Y, Wang J. Identifying potential drug targets in hepatocellular carcinoma based on network analysis and one-class support vector machine. Sci Rep 2019; 9:10442. [PMID: 31320657 PMCID: PMC6639372 DOI: 10.1038/s41598-019-46540-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 01/25/2019] [Accepted: 06/26/2019] [Indexed: 02/08/2023] [Open Access]
Abstract
Hepatocellular carcinoma (HCC) is a major cause of cancer-related death worldwide. At present, however, the systematic therapy for the advanced stages of HCC is rather limited. Thus, the discovery of novel drug targets, and thereafter of targeted drugs against HCC, is continuously needed. In this study, we combined clinical association data, gene expression profiles and manually collected drug target genes with the human protein-protein interaction (PPI) network to establish an in-silico HCC drug target predictor. First, we found that drug target genes (DTGs), disease-associated genes (DAGs), prognostic unfavorable genes (PUGs) and cancer up-regulated genes (URGs) have higher degree, betweenness and closeness centrality, while cancer down-regulated genes (DRGs) and prognostic favorable genes (PFGs) have lower degrees, in comparison with background genes. Moreover, DTG nodes were shown to be closer to DAG, PUG and URG nodes, but farther away from PFG and DRG nodes. Compared to the background, PFGs and DRGs were shown to have relatively larger genetic dependency scores, while PUGs and URGs have smaller genetic dependency scores. Finally, based on the observed features of DTGs, we constructed a drug target predictor using a one-class support vector machine (one-class SVM). Performance evaluation results suggested that our predictor could effectively identify putative drug target genes for further research.
Affiliation(s)
- Zhan Tong
  - Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Beijing, 100191, China
- Yuan Zhou
  - Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Beijing, 100191, China
- Juan Wang
  - Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Beijing, 100191, China
56
Failli M, Paananen J, Fortino V. Prioritizing target-disease associations with novel safety and efficacy scoring methods. Sci Rep 2019; 9:9852. [PMID: 31285471 PMCID: PMC6614395 DOI: 10.1038/s41598-019-46293-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 12/21/2018] [Accepted: 06/25/2019] [Indexed: 01/24/2023] [Open Access]
Abstract
Biological target (commonly genes or proteins) identification is still largely a manual process, where experts manually try to collect and combine information from hundreds of data sources, ranging from scientific publications to omics databases. Targeting the wrong gene or protein will lead to failure of the drug development process, as well as incur delays and costs. To improve this process, different software platforms are being developed. These platforms rely strongly on efficacy estimates based on target-disease association scores created by computational methods for drug target prioritization. Here novel computational methods are presented to more accurately evaluate the efficacy and safety of potential drug targets. The proposed efficacy scores utilize existing gene expression data and tissue/disease specific networks to improve the inference of target-disease associations. Conversely, safety scores enable the identification of genes that are essential, potentially susceptible to adverse effects or carcinogenic. Benchmark results demonstrate that our transcriptome-based methods for drug target prioritization can increase the true positive rate of target-disease associations. Additionally, the proposed safety evaluation system enables accurate predictions of targets of withdrawn drugs and targets of drug trials prematurely discontinued.
Affiliation(s)
- Mario Failli
  - Institute of Biomedicine, University of Eastern Finland, Kuopio, Finland
- Jussi Paananen
  - Institute of Biomedicine, University of Eastern Finland, Kuopio, Finland
- Vittorio Fortino
  - Institute of Biomedicine, University of Eastern Finland, Kuopio, Finland
57
Florian P, Flechsenhar KR, Bartnik E, Ding‐Pfennigdorff D, Herrmann M, Bryce PJ, Nestle FO. Translational drug discovery and development with the use of tissue‐relevant biomarkers: Towards more physiological relevance and better prediction of clinical efficacy. Exp Dermatol 2019; 29:4-14. [DOI: 10.1111/exd.13942] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 06/12/2018] [Revised: 02/28/2019] [Accepted: 03/26/2019] [Indexed: 12/13/2022]
Affiliation(s)
- Peter Florian
  - Department of Type 1/17 Immunology and Arthritis, Sanofi, Frankfurt, Germany
- Eckart Bartnik
  - Department of Type 1/17 Immunology and Arthritis, Sanofi, Frankfurt, Germany
- Matthias Herrmann
  - Department of Type 1/17 Immunology and Arthritis, Sanofi, Frankfurt, Germany
- Paul J. Bryce
  - Department of Type 2 Inflammation and Fibrosis, Sanofi, Cambridge, Massachusetts
- Frank O. Nestle
  - Global Head of Immunology Therapeutic Research Area, Sanofi, Cambridge, Massachusetts
58
Phenotypes associated with genes encoding drug targets are predictive of clinical trial side effects. Nat Commun 2019; 10:1579. [PMID: 30952858 PMCID: PMC6450952 DOI: 10.1038/s41467-019-09407-3] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Received: 08/15/2018] [Accepted: 03/07/2019] [Indexed: 12/19/2022] [Open Access]
Abstract
Only a small fraction of early drug programs progress to the market, due to safety and efficacy failures, despite extensive efforts to predict safety. Characterizing the effect of natural variation in the genes encoding drug targets should present a powerful approach to predict side effects arising from drugging particular proteins. In this retrospective analysis, we report a correlation between the organ systems affected by genetic variation in drug targets and the organ systems in which side effects are observed. Across 1819 drugs and 21 phenotype categories analyzed, drug side effects are more likely to occur in organ systems where there is genetic evidence of a link between the drug target and a phenotype involving that organ system, compared to when there is no such genetic evidence (30.0 vs 19.2%; OR = 1.80). This result suggests that human genetic data should be used to predict safety issues associated with drug targets. Safety issues including side effects are one of the major factors causing failure of clinical trials in drug development. Here, the authors leverage information about phenotypes associated with variation in genes encoding drug targets to predict drug-treatment-related side effects.
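As a rough sanity check on the figures quoted in this abstract, the unadjusted odds ratio implied by the two percentages (30.0% vs 19.2%) can be recomputed directly. This is only a sketch: the function names are illustrative, and it assumes the reported OR is a simple ratio of odds, whereas the authors may have derived theirs from a different or adjusted model.

```python
def odds(probability: float) -> float:
    """Convert a probability (strictly between 0 and 1) into odds p / (1 - p)."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return probability / (1.0 - probability)


def odds_ratio(p_with_evidence: float, p_without_evidence: float) -> float:
    """Unadjusted odds ratio between two groups given their event probabilities."""
    return odds(p_with_evidence) / odds(p_without_evidence)


# Side-effect rates quoted in the abstract (assumed here to be simple proportions).
p_with = 0.300     # organ systems with genetic evidence linking target and phenotype
p_without = 0.192  # organ systems without such genetic evidence

or_value = odds_ratio(p_with, p_without)
print(f"odds ratio = {or_value:.2f}")  # prints "odds ratio = 1.80"
```

The recomputed value agrees with the OR = 1.80 stated in the abstract, which suggests the two percentages and the odds ratio describe the same comparison.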
59
Pizzorno A, Padey B, Terrier O, Rosa-Calatrava M. Drug Repurposing Approaches for the Treatment of Influenza Viral Infection: Reviving Old Drugs to Fight Against a Long-Lived Enemy. Front Immunol 2019; 10:531. [PMID: 30941148 PMCID: PMC6434107 DOI: 10.3389/fimmu.2019.00531] [Citation(s) in RCA: 66] [Impact Index Per Article: 11.0] [Received: 11/07/2018] [Accepted: 02/27/2019] [Indexed: 12/18/2022] [Open Access]
Abstract
Influenza viruses still constitute a real public health problem today. To cope with the emergence of new circulating strains, as well as of strains resistant to classic antivirals, it is necessary to develop new antiviral approaches. This review summarizes the state of the art of current antiviral options against influenza infection, with a particular focus on recent advances in anti-influenza drug repurposing strategies and their potential therapeutic, regulatory and economic benefits. The review illustrates the multiple ways to reposition molecules for the treatment of influenza, from adventitious discovery to in silico-based screening. These novel antiviral molecules, many of which target the host cell, in combination with conventional antiviral agents targeting the virus, will ideally enter the clinic and reinforce the therapeutic arsenal to combat influenza virus infections.
60
Artificial intelligence in drug development: present status and future prospects. Drug Discov Today 2018; 24:773-780. [PMID: 30472429 DOI: 10.1016/j.drudis.2018.11.014] [Citation(s) in RCA: 306] [Impact Index Per Article: 43.7] [Received: 07/21/2018] [Revised: 10/14/2018] [Accepted: 11/19/2018] [Indexed: 12/11/2022]
Abstract
Artificial intelligence (AI) uses personified knowledge and learns from the solutions it produces to address not only specific but also complex problems. Remarkable improvements in computational power coupled with advancements in AI technology could be utilised to revolutionise the drug development process. At present, the pharmaceutical industry is facing challenges in sustaining their drug development programmes because of increased R&D costs and reduced efficiency. In this review, we discuss the major causes of attrition rates in new drug approvals, the possible ways that AI can improve the efficiency of the drug development process and collaboration of pharmaceutical industry giants with AI-powered drug discovery firms.
61
Freudenberg JM, Dunham I, Sanseau P, Rajpal DK. Uncovering new disease indications for G-protein coupled receptors and their endogenous ligands. BMC Bioinformatics 2018; 19:345. [PMID: 30285606 PMCID: PMC6167889 DOI: 10.1186/s12859-018-2392-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 05/15/2018] [Accepted: 09/23/2018] [Indexed: 11/29/2022] [Open Access]
Abstract
Background: The Open Targets Platform integrates different data sources in order to facilitate identification of potential therapeutic drug targets to treat human diseases. It currently provides evidence for nearly 2.6 million potential target-disease pairs. G-protein coupled receptors are a drug target class of high interest because of the number of successful drugs being developed against them over many years. Here we describe a systematic approach utilizing the Open Targets Platform data to uncover and prioritize potential new disease indications for the G-protein coupled receptors and their ligands.
Results: Utilizing the data available in the Open Targets platform, potential G-protein coupled receptor and endogenous ligand disease association pairs were systematically identified. Intriguing examples such as GPR35 for inflammatory bowel disease and CXCR4 for viral infection are used as illustrations of how a systematic approach can aid in the prioritization of interesting drug discovery hypotheses. Combining evidence for G-protein coupled receptors and their corresponding endogenous peptidergic ligands increases confidence and provides supportive evidence for potential new target-disease hypotheses. Comparing such hypotheses to the global pharma drug discovery pipeline to validate the approach showed that more than 93% of G-protein coupled receptor-disease pairs with a high overall Open Targets score involved receptors with an existing drug discovery program.
Conclusions: The Open Targets gene-disease score can be used to prioritize potential G-protein coupled receptor-indication hypotheses. In addition, the availability of multiple different evidence types markedly increases confidence, as does combining evidence from known receptor-ligand pairs. Comparing the top-ranked hypotheses to the current global pharma pipeline serves as validation of our approach and identifies and prioritizes new therapeutic opportunities.
Electronic supplementary material: The online version of this article (10.1186/s12859-018-2392-y) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Ian Dunham
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Philippe Sanseau
  - Open Targets, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
  - Computational Biology and Stats, Target Sciences, GSK Medicines Research Centre, Gunnels Wood Road, Stevenage, SG1 2NY, UK
- Deepak K Rajpal
  - Computational Biology, Target Sciences, GlaxoSmithKline, Collegeville, PA, 19426, USA
62
Valdebenito S, Lou E, Baldoni J, Okafo G, Eugenin E. The Novel Roles of Connexin Channels and Tunneling Nanotubes in Cancer Pathogenesis. Int J Mol Sci 2018; 19:E1270. [PMID: 29695070 PMCID: PMC5983846 DOI: 10.3390/ijms19051270] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 03/26/2018] [Revised: 04/13/2018] [Accepted: 04/18/2018] [Indexed: 12/28/2022] [Open Access]
Abstract
Neoplastic growth and cellular differentiation are critical hallmarks of tumor development. It is well established that cell-to-cell communication between tumor cells and "normal" surrounding cells regulates tumor differentiation and proliferation, aggressiveness, and resistance to treatment. Nevertheless, the mechanisms that result in tumor growth and spread as well as the adaptation of healthy surrounding cells to the tumor environment are poorly understood. A major component of these communication systems is composed of connexin (Cx)-containing channels including gap junctions (GJs), tunneling nanotubes (TNTs), and hemichannels (HCs). There are hundreds of reports about the role of Cx-containing channels in the pathogenesis of cancer, and most of them demonstrate a downregulation of these proteins. Nonetheless, new data demonstrate that a localized communication via Cx-containing GJs, HCs, and TNTs plays a key role in tumor growth, differentiation, and resistance to therapies. Moreover, the type and downstream effects of signals communicated between the different populations of tumor cells are still unknown. However, new approaches such as artificial intelligence (AI) and machine learning (ML) could provide new insights into these signals communicated between connected cells. We propose that the identification and characterization of these new communication systems and their associated signaling could provide new targets to prevent or reduce the devastating consequences of cancer.
Affiliation(s)
- Silvana Valdebenito
  - Public Health Research Institute (PHRI), Newark, NJ 07103, USA
  - Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School, Rutgers the State University of NJ, Newark, NJ 07103, USA
- Emil Lou
  - Department of Medicine, Division of Hematology, Oncology and Transplantation, University of Minnesota, Minneapolis, MN 55455, USA
- John Baldoni
  - GlaxoSmithKline, In-Silico Drug Discovery Unit, 1250 South Collegeville Road, Collegeville, PA 19426, USA
- George Okafo
  - GlaxoSmithKline, In-Silico Drug Discovery Unit, Stevenage SG1 2NY, UK
- Eliseo Eugenin
  - Public Health Research Institute (PHRI), Newark, NJ 07103, USA
  - Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School, Rutgers the State University of NJ, Newark, NJ 07103, USA
63
Brown N, Cambruzzi J, Cox PJ, Davies M, Dunbar J, Plumbley D, Sellwood MA, Sim A, Williams-Jones BI, Zwierzyna M, Sheppard DW. Big Data in Drug Discovery. PROGRESS IN MEDICINAL CHEMISTRY 2018; 57:277-356. [PMID: 29680150 DOI: 10.1016/bs.pmch.2017.12.003] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Indexed: 12/02/2022]
Abstract
Interpretation of Big Data in the drug discovery community should enhance project timelines and reduce clinical attrition through improved early decision making. The issues we encounter start with the sheer volume of data and how we first ingest it before building an infrastructure to house it to make use of the data in an efficient and productive way. There are many problems associated with the data itself including general reproducibility, but often, it is the context surrounding an experiment that is critical to success. Help, in the form of artificial intelligence (AI), is required to understand and translate the context. On the back of natural language processing pipelines, AI is also used to prospectively generate new hypotheses by linking data together. We explain Big Data from the context of biology, chemistry and clinical trials, showcasing some of the impressive public domain sources and initiatives now available for interrogation.
Affiliation(s)
- Aaron Sim
  - BenevolentAI, London, United Kingdom
- Magdalena Zwierzyna
  - BenevolentAI, London, United Kingdom
  - Institute of Cardiovascular Science, University College London, London, United Kingdom