1. Pang L, He K, Zhang Y, Li P, Lin Y, Yue J. Predicting environmental risks of pharmaceutical residues by wastewater surveillance: An analysis based on pharmaceutical sales and their excretion data. Sci Total Environ 2024; 916:170204. [PMID: 38262535 DOI: 10.1016/j.scitotenv.2024.170204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Revised: 12/23/2023] [Accepted: 01/14/2024] [Indexed: 01/25/2024]
Abstract
Pharmaceutical residues are increasingly becoming a significant source of environmental water pollution and ecological risk. This study, leveraging official national pharmaceutical sales statistics, predicts the environmental concentrations of 33 typical pharmaceuticals in the Tianjin area. The results show that 52 % of the drugs have a PEC/MEC (Predicted Environmental Concentration/Measured Environmental Concentration) ratio within the acceptable range of 0.5-2, including atenolol (1.21), carbamazepine (1.22), and sulfamethoxazole (0.91). Among the selected drugs, tetracycline, ciprofloxacin, and acetaminophen had the highest predicted concentrations. The EPI (Estimation Programs Interface) biodegradation model, a tool from the US Environmental Protection Agency, is used to predict the removal efficiency of compounds in wastewater treatment plants. The results indicate that the EPI predictions are acceptable for macrolide antibiotics and β-blockers, with removal rates of roxithromycin, spiramycin, acetaminophen, and carbamazepine being 14.1 %, 61.2 %, 75.1 %, and 44.5 %, respectively. However, the model proved to be less effective for fluoroquinolone antibiotics. The ECOSAR (Ecological Structure-Activity Relationships) model was used to supplement the assessment of the potential impacts of drugs on aquatic ecosystems, further refining the analysis of pharmaceutical environmental risks. By combining the concentration and detection frequency of pharmaceutical wastewater, this study identified 9 drugs with significant toxicological risks and marked another 24 drugs as substances of potential concern. Additionally, this study provides data support for addressing pharmaceutical residues of priority concern in subsequent research.
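The PEC/MEC screening criterion mentioned in the abstract can be illustrated with a minimal Python sketch. The three ratios are the ones quoted above; the function names and the treatment of the 0.5-2 bounds as inclusive are assumptions made for illustration, not details taken from the paper.

```python
def pec_mec_ratio(pec, mec):
    """Ratio of predicted to measured environmental concentration (same units)."""
    if mec <= 0:
        raise ValueError("measured concentration must be positive")
    return pec / mec


def within_acceptable_range(ratio, low=0.5, high=2.0):
    """True if the ratio lies in the 0.5-2 range (bounds assumed inclusive)."""
    return low <= ratio <= high


# Ratios quoted in the abstract.
reported_ratios = {
    "atenolol": 1.21,
    "carbamazepine": 1.22,
    "sulfamethoxazole": 0.91,
}

for drug, ratio in reported_ratios.items():
    status = "acceptable" if within_acceptable_range(ratio) else "outside range"
    print(f"{drug}: PEC/MEC = {ratio:.2f} ({status})")
```

A ratio below 0.5 would mean the sales-based prediction underestimates the measured concentration by more than a factor of two, and a ratio above 2 would mean it overestimates by more than that factor.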
Affiliation(s)
- Lihao Pang
  - College of Environmental Science and Safety Engineering, Tianjin University of Technology, Tianjin 300384, China
- Kai He
  - College of Civil Engineering, Sun Yat-Sen University, Guangzhou 51000, China
- Yuxuan Zhang
  - College of Civil Engineering, Sun Yat-Sen University, Guangzhou 51000, China
- Penghui Li
  - College of Environmental Science and Safety Engineering, Tianjin University of Technology, Tianjin 300384, China
- Yingchao Lin
  - College of Environmental Science and Engineering, Nankai University, Tianjin 300350, China
- Junjie Yue
  - College of Environmental Science and Safety Engineering, Tianjin University of Technology, Tianjin 300384, China
2. Qi X, Zhao Y, Qi Z, Hou S, Chen J. Machine Learning Empowering Drug Discovery: Applications, Opportunities and Challenges. Molecules 2024; 29:903. [PMID: 38398653 PMCID: PMC10892089 DOI: 10.3390/molecules29040903] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Revised: 02/08/2024] [Accepted: 02/14/2024] [Indexed: 02/25/2024] Open
Abstract
Drug discovery plays a critical role in advancing human health by developing new medications and treatments to combat diseases. How to accelerate the pace and reduce the costs of new drug discovery has long been a key concern for the pharmaceutical industry. Fortunately, by leveraging advanced algorithms, computational power and biological big data, artificial intelligence (AI) technology, especially machine learning (ML), holds the promise of making the hunt for new drugs more efficient. Recently, the Transformer-based models that have achieved revolutionary breakthroughs in natural language processing have sparked a new era of their applications in drug discovery. Herein, we introduce the latest applications of ML in drug discovery, highlight the potential of advanced Transformer-based ML models, and discuss the future prospects and challenges in the field.
Affiliation(s)
- Xin Qi
  - School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou 215011, China
- Yuanchun Zhao
  - School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou 215011, China
- Zhuang Qi
  - School of Software, Shandong University, Jinan 250101, China
- Siyu Hou
  - School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou 215011, China
- Jiajia Chen
  - School of Chemistry and Life Sciences, Suzhou University of Science and Technology, Suzhou 215011, China
3. Durojaye OA, Ejaz U, Uzoeto HO, Fadahunsi AA, Opabunmi AO, Ekpo DE, Sedzro DM, Idris MO. CSC01 shows promise as a potential inhibitor of the oncogenic G13D mutant of KRAS: an in silico approach. Amino Acids 2023; 55:1745-1764. [PMID: 37500789 DOI: 10.1007/s00726-023-03304-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2023] [Accepted: 07/11/2023] [Indexed: 07/29/2023]
Abstract
About 30% of malignant tumors include KRAS mutations, which are frequently required for the development and maintenance of malignancies. KRAS is now a top-priority cancer target as a result. After years of research, it is now understood that the oncogenic KRAS-G12C can be targeted. However, many other forms, such as the G13D mutant, are yet to be addressed. Here, we used a receptor-based pharmacophore modeling technique to generate potential inhibitors of the KRAS-G13D oncogenic mutant. Using a comprehensive virtual screening workflow model, top hits were selected, out of which CSC01 was identified as a promising inhibitor of the oncogenic KRAS mutant (G13D). The stability of CSC01 upon binding the switch II pocket was evaluated through an exhaustive molecular dynamics simulation study. The several post-simulation analyses conducted suggest that CSC01 formed a stable complex with KRAS-G13D. CSC01, through a dynamic protein-ligand interaction profiling analysis, was also shown to maintain strong interactions with the mutated aspartic acid residue throughout the simulation. Although binding free energy analysis through the umbrella sampling approach suggested that the affinity of CSC01 with the switch II pocket of KRAS-G13D is moderate, our DFT analysis showed that the stable interaction of the compound might be facilitated by the existence of favorable molecular electrostatic potentials. Furthermore, based on ADMET predictions, CSC01 demonstrated a satisfactory drug likeness and toxicity profile, making it an exemplary candidate for consideration as a potential KRAS-G13D inhibitor.
Affiliation(s)
- Olanrewaju Ayodeji Durojaye
  - MOE Key Laboratory of Membraneless Organelle and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, University of Science and Technology of China, Hefei, 230027, Anhui, China
  - School of Life Sciences, University of Science and Technology of China, Hefei, 230027, Anhui, China
  - Department of Chemical Sciences, Coal City University, Emene, Enugu State, Nigeria
- Umer Ejaz
  - MOE Key Laboratory of Membraneless Organelle and Cellular Dynamics, Hefei National Laboratory for Physical Sciences at the Microscale, University of Science and Technology of China, Hefei, 230027, Anhui, China
  - School of Life Sciences, University of Science and Technology of China, Hefei, 230027, Anhui, China
  - School of Life Science and Technology, China Pharmaceutical University, Nanjing, 210009, China
- Henrietta Onyinye Uzoeto
  - Federal College of Dental Technology, Trans-Ekulu, Enugu State, Nigeria
  - Department of Biological Sciences, Coal City University, Emene, Enugu State, Nigeria
- Adeola Abraham Fadahunsi
  - Graduate School of Biomedical Science and Engineering, University of Maine, Orono, ME, 04469, USA
- Adebayo Oluwole Opabunmi
  - RNA Medical Center, International Institutes of Medicine, Zhejiang University, Hangzhou, China
  - Zhejiang University-University of Edinburgh Institute, Zhejiang University, Hangzhou, China
  - The Fourth Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Daniel Emmanuel Ekpo
  - Institute of Biological Science and Technology, National Engineering Research Center for Non-Food Biorefinery, Guangxi Academy of Sciences, Nanning, 530007, China
  - Department of Biochemistry, Faculty of Biological Sciences, University of Nigeria, 410001, Nsukka, Enugu State, Nigeria
- Divine Mensah Sedzro
  - Wisconsin National Primate Research Center, University of Wisconsin Graduate School, 1220 Capitol Court, Madison, 53715, WI, USA
- Mukhtar Oluwaseun Idris
  - School of Life Sciences, University of Science and Technology of China, Hefei, 230027, Anhui, China
4. Wang H, Liu W, Chen J, Wang Z. Applicability Domains Based on Molecular Graph Contrastive Learning Enable Graph Attention Network Models to Accurately Predict 15 Environmental End Points. Environ Sci Technol 2023; 57:16906-16917. [PMID: 37897806 DOI: 10.1021/acs.est.3c03860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/30/2023]
Abstract
In silico models for predicting physicochemical properties and environmental fate parameters are necessary for the sound management of chemicals. This study employed graph attention network (GAT) algorithms to construct such models on 15 end points. The results showed that the GAT models outperformed the previous state-of-the-art models, and their performance was not influenced by the presence or absence of compounds with certain structures. Molecular similarity density (ρs) was found to be a key metric characterizing data set modelability, in addition to the proportion of compounds at activity cliffs. By introducing molecular graph (MG) contrastive learning, MG-based ρs and molecular inconsistency in activities (IA) were calculated and employed for characterizing the structure-activity landscape (SAL)-based applicability domain ADSAL{ρs, IA}. The GAT models coupled with ADSAL{ρs, IA} significantly improved the prediction coefficient of determination (R²) on all the end points by an average of 14.4% and enabled all the end points to have R² > 0.9, which could hardly be achieved previously. The models were employed to screen persistent, mobile, and/or bioaccumulative chemicals from inventories consisting of about 10⁶ chemicals. Given the current state-of-the-art model performance and coverage of the various environmental end points, the constructed models with ADSAL{ρs, IA} may serve as benchmarks for future efforts to improve modeling efficacy.
Affiliation(s)
- Haobo Wang
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Wenjia Liu
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Jingwen Chen
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Zhongyu Wang
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
5. Zhang Y, Xie L, Zhang D, Xu X, Xu L. Application of Machine Learning Methods to Predict the Air Half-Lives of Persistent Organic Pollutants. Molecules 2023; 28:7457. [PMID: 38005179 PMCID: PMC10673120 DOI: 10.3390/molecules28227457] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2023] [Revised: 11/01/2023] [Accepted: 11/02/2023] [Indexed: 11/26/2023] Open
Abstract
Persistent organic pollutants (POPs) are ubiquitous and bioaccumulative, posing potential and long-term threats to human health and the ecological environment. Quantitative structure-activity relationship (QSAR) studies play a guiding role in analyzing the toxicity and environmental fate of different organic pollutants. In the current work, five molecular descriptors are used to construct QSAR models for predicting the mean and maximum air half-lives of POPs: the energy of the highest occupied molecular orbital (HOMO_Energy_DMol3), a component of the dipole moment along the z-axis (Dipole_Z), fragment contribution to SAscore (SAscore_Fragments), subgraph counts (SC_3_P), and structural information content (SIC). The QSAR models were built using three machine learning methods: partial least squares (PLS), multiple linear regression (MLR), and genetic function approximation (GFA). The determination coefficients (R²) and relative errors (RE) for the mean air half-life of each model are 0.916 and 3.489% (PLS), 0.939 and 5.048% (MLR), 0.938 and 5.131% (GFA), respectively. Similarly, the R² and RE for the maximum air half-life of each model are 0.915 and 5.629% (PLS), 0.940 and 10.090% (MLR), 0.939 and 11.172% (GFA), respectively. Furthermore, the mechanisms that elucidate the significant factors impacting the air half-lives of POPs have been explored. The three regression models show good predictive and extrapolation abilities for POPs within the application domain.
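The two figures of merit reported in the abstract, the determination coefficient and the relative error, can be sketched in plain Python as follows. The abstract does not state how the relative error was averaged, so the mean absolute percentage form used here is an assumption, and the observed and predicted values are made-up placeholders, not data from the paper.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_relative_error_percent(observed, predicted):
    """Mean of |observed - predicted| / |observed|, in percent (assumed definition)."""
    errors = [abs(o - p) / abs(o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical half-life values, for illustration only.
observed = [2.0, 3.5, 4.1, 5.0, 6.2]
predicted = [2.1, 3.3, 4.4, 4.8, 6.0]

print(f"R^2 = {r_squared(observed, predicted):.3f}")
print(f"RE  = {mean_relative_error_percent(observed, predicted):.2f}%")
```

Reporting both numbers is useful because a model can have a high R² while still showing large relative errors on compounds with small half-lives.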
Affiliation(s)
- Xiaojun Xu
  - Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Lei Xu
  - Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China
6. George G, Yadav N, Auti PS, Paul AT. Molecular modelling, synthesis and in vitro evaluation of quinazolinone hybrid analogues as potential pancreatic lipase inhibitors. J Biomol Struct Dyn 2023; 41:9583-9601. [PMID: 36350239 DOI: 10.1080/07391102.2022.2144456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2022] [Accepted: 10/30/2022] [Indexed: 11/11/2022]
Abstract
Obesity is a multifactorial metabolic disorder, growing at an alarming rate across the world. Amongst the numerous targets explored for obesity management, inhibition of pancreatic lipase (PL) is considered one of the promising approaches. Orlistat is the only PL inhibitory drug approved for long-term treatment of obesity. However, it is reported to cause hepatotoxicity and nephrotoxicity. Thus, novel drug candidates that act through PL inhibition are urgently needed. With this aim, a series of quinazolinone hybrid analogues were synthesized, characterized and evaluated for their PL inhibitory potential. The physicochemical properties and toxicity parameters of the screened analogues were within an acceptable range. Amongst the synthesised analogues, QH-25 exerted potential PL inhibition (IC50 = 16.99 ± 0.54 µM). Further, enzyme inhibition studies suggested a reversible competitive inhibition. Molecular docking of these analogues was in line with in vitro results, wherein the obtained MolDock scores exhibited a significant correlation with their inhibitory activity (Pearson's r = 0.6629). To further confirm the stability of the QH-25-PL complex in a dynamic environment, a molecular dynamics study (100 ns) was carried out, and the results suggested that this complex is stable under dynamic conditions. Overall, these results shed light on the quinazolinone hybrids as potential PL inhibitors. Further structural modification may result in the development of potent anti-obesity agents which act through PL inhibition. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Ginson George
  - Laboratory of Natural Product Chemistry, Department of Pharmacy, Birla Institute of Technology and Science (BITS-Pilani), Pilani campus, Pilani, Rajasthan, India
- Nisha Yadav
  - Laboratory of Natural Product Chemistry, Department of Pharmacy, Birla Institute of Technology and Science (BITS-Pilani), Pilani campus, Pilani, Rajasthan, India
- Prashant S Auti
  - Laboratory of Natural Product Chemistry, Department of Pharmacy, Birla Institute of Technology and Science (BITS-Pilani), Pilani campus, Pilani, Rajasthan, India
- Atish Tulshiram Paul
  - Laboratory of Natural Product Chemistry, Department of Pharmacy, Birla Institute of Technology and Science (BITS-Pilani), Pilani campus, Pilani, Rajasthan, India
7. Su L, Wang Z, Wang Y, Xiao Z, Xia D, Zhang S, Chen J. Predicting adsorption of organic compounds onto graphene and black phosphorus by molecular dynamics and machine learning. Environ Sci Pollut Res Int 2023; 30:108846-108854. [PMID: 37759049 DOI: 10.1007/s11356-023-29962-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Accepted: 09/13/2023] [Indexed: 09/29/2023]
Abstract
With the increasing production and application of various engineering nanomaterials (ENMs), their release into the environment is inevitable. Adsorption of organic chemicals onto ENMs affects the environmental behavior and toxicology of both. It is unrealistic to experimentally determine adsorption equilibrium constants (K) for the vast number of organics and ENMs because of the high cost in money and time. Herein, appropriate molecular dynamics (MD) methods were evaluated and selected by comparing experimental K values of seven organics adsorbed onto graphene with the MD-calculated ones. Machine learning (ML) models for K of organics adsorbed onto graphene and black phosphorus nanomaterials were constructed based on a benchmark data set from the MD simulations. Lasso models based on Mordred descriptors outperformed ML models built by support vector machine, random forest, k-nearest neighbor, and gradient boosting decision tree, in terms of cross-validation coefficients (Q² > 0.90). The Lasso models also outperformed conventional poly-parameter linear free energy relationship models for predicting logK. Compared with previous models, the Lasso models considered more compounds with different functional groups and thus have broader applicability domains. This study provides a promising way to fill the data gap in logK for chemicals adsorbed onto the ENMs.
Affiliation(s)
- Lihao Su
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian, 116024, China
- Zhongyu Wang
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian, 116024, China
- Ya Wang
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian, 116024, China
- Zijun Xiao
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian, 116024, China
- Deming Xia
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian, 116024, China
- Siyu Zhang
  - Key Laboratory of Pollution Ecology and Environmental Engineering, Institute of Applied Ecology, Chinese Academy of Sciences, Shenyang, 110016, China
- Jingwen Chen
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian, 116024, China
8. Wu J, Su Y, Yang A, Ren J, Xiang Y. An improved multi-modal representation-learning model based on fusion networks for property prediction in drug discovery. Comput Biol Med 2023; 165:107452. [PMID: 37690287 DOI: 10.1016/j.compbiomed.2023.107452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2023] [Revised: 08/12/2023] [Accepted: 09/04/2023] [Indexed: 09/12/2023]
Abstract
Accurate characterization of molecular representations plays an important role in property prediction based on deep learning (DL) for drug discovery. However, most previous studies considered only one type of molecular representation, which makes it difficult to capture the full molecular feature information. In this study, a novel DL framework called multi-modal molecular representation learning fusion network (MMRLFN) is developed, which can simultaneously learn and integrate drug molecular features from molecular graphs and SMILES sequences. The MMRLFN method is composed of three complementary deep neural networks that learn various features from different molecular representations, such as molecular topology, local chemical background information, and substructures at varying scales. Eight public datasets involving various molecular properties used in drug discovery were employed to train and evaluate the MMRLFN. The obtained models performed better than the existing models based on mono-modal molecular representations. Additionally, a thorough analysis of the noise resistance and interpretability of the MMRLFN has been carried out. The generalization ability and effectiveness of the MMRLFN have also been verified by case studies. Overall, the MMRLFN can accurately predict molecular properties and provide potentially valuable information from large datasets, thereby maximizing the possibility of successful drug discovery.
Affiliation(s)
- Jinzhou Wu
  - School of Intelligent Technology and Engineering, Chongqing University of Science and Technology, Chongqing, 401331, China
- Yang Su
  - School of Intelligent Technology and Engineering, Chongqing University of Science and Technology, Chongqing, 401331, China
- Ao Yang
  - School of Safety Engineering (School of Emergency Management), Chongqing University of Science and Technology, Chongqing, 401331, China
- Jingzheng Ren
  - Department of Industrial and Systems Engineering, The Hong Kong Polytechnic University, China
- Yi Xiang
  - School of Intelligent Technology and Engineering, Chongqing University of Science and Technology, Chongqing, 401331, China
9. Qureshi R, Irfan M, Gondal TM, Khan S, Wu J, Hadi MU, Heymach J, Le X, Yan H, Alam T. AI in drug discovery and its clinical relevance. Heliyon 2023; 9:e17575. [PMID: 37396052 PMCID: PMC10302550 DOI: 10.1016/j.heliyon.2023.e17575] [Citation(s) in RCA: 22] [Impact Index Per Article: 22.0] [Received: 01/10/2023] [Revised: 06/17/2023] [Accepted: 06/21/2023] [Indexed: 07/04/2023] Open
Abstract
The COVID-19 pandemic has emphasized the need for novel drug discovery processes. However, the journey from conceptualizing a drug to its eventual implementation in clinical settings is a long, complex, and expensive process, with many potential points of failure. Over the past decade, a vast growth in medical information has coincided with advances in computational hardware (cloud computing, GPUs, and TPUs) and the rise of deep learning. Medical data generated from large molecular screening profiles, personal health or pathology records, and public health organizations could benefit from analysis by Artificial Intelligence (AI) approaches to speed up and prevent failures in the drug discovery pipeline. We present applications of AI at various stages of drug discovery pipelines, including the inherently computational approaches of de novo design and prediction of a drug's likely properties. Open-source databases and AI-based software tools that facilitate drug design are discussed along with their associated problems of molecule representation, data collection, complexity, labeling, and disparities among labels. How contemporary AI methods, such as graph neural networks, reinforcement learning, and generative models, along with structure-based methods (i.e., molecular dynamics simulations and molecular docking), can contribute to drug discovery applications and analysis of drug responses is also explored. Finally, recent developments and investments in AI-based start-up companies for biotechnology and drug design, and their current progress, hopes and promotions, are discussed in this article.
Affiliation(s)
- Rizwan Qureshi
  - College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
  - Department of Imaging Physics, MD Anderson Cancer Center, The University of Texas, Houston, USA
- Muhammad Irfan
  - Faculty of Electrical Engineering, Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Swabi, Pakistan
- Sheheryar Khan
  - School of Professional Education & Executive Development, The Hong Kong Polytechnic University, Hong Kong
- Jia Wu
  - Department of Imaging Physics, MD Anderson Cancer Center, The University of Texas, Houston, USA
- John Heymach
  - Department of Thoracic Head and Neck Medical Oncology, Division of Cancer Medicine, The University of Texas, MD Anderson Cancer Center, Houston, USA
- Xiuning Le
  - Department of Thoracic Head and Neck Medical Oncology, Division of Cancer Medicine, The University of Texas, MD Anderson Cancer Center, Houston, USA
- Hong Yan
  - Department of Electrical Engineering, City University of Hong Kong, Kowloon, Hong Kong
- Tanvir Alam
  - College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
10. Srivathsa AV, Sadashivappa NM, Hegde AK, Radha S, Mahesh AR, Ammunje DN, Sen D, Theivendren P, Govindaraj S, Kunjiappan S, Pavadai P. A Review on Artificial Intelligence Approaches and Rational Approaches in Drug Discovery. Curr Pharm Des 2023; 29:1180-1192. [PMID: 37132148 DOI: 10.2174/1381612829666230428110542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2022] [Revised: 02/06/2023] [Accepted: 02/27/2023] [Indexed: 05/04/2023]
Abstract
Artificial intelligence (AI) speeds up the drug development process and reduces its cost, which is of enormous importance in outbreaks such as COVID-19. It uses a set of machine learning algorithms that collect the available data from resources, categorise and process it, and develop novel learning methodologies. Virtual screening is a successful application of AI, which is used to screen huge drug-like databases and filter them down to a small number of compounds. At the core of AI are neural networks, which use techniques such as the Convolutional Neural Network (CNN), Recurrent Neural Network (RNN) or Generative Adversarial Network (GAN). The applications range from small-molecule drug discovery to the development of vaccines. In the present review article, we discuss various techniques of drug design, both structure-based and ligand-based, as well as pharmacokinetics and toxicity prediction using AI. A rapid pace of discovery is the need of the hour, and AI is a targeted approach to achieve this.
Affiliation(s)
- Anjana Vidya Srivathsa
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, M.S.R. Nagar, Bengaluru, 560054, India
- Nandini Markuli Sadashivappa
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, M.S.R. Nagar, Bengaluru, 560054, India
- Apeksha Krishnamurthy Hegde
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, M.S.R. Nagar, Bengaluru, 560054, India
- Srimathi Radha
  - Department of Pharmaceutical Chemistry, SRM College of Pharmacy, Faculty of Medicine and Health Sciences, SRM Institute of Science and Technology, Chengalpattu District, Kattankulathur, Tamil Nadu, 603203, India
- Agasa Ramu Mahesh
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, M.S.R. Nagar, Bengaluru, 560054, India
- Damodar Nayak Ammunje
  - Department of Pharmacology, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, M.S.R. Nagar, Bengaluru, 560054, India
- Debanjan Sen
  - Department of Pharmaceutical Chemistry, BCDA College of Pharmacy & Technology, Hridaypur, Kolkata, 700127, West Bengal, India
- Panneerselvam Theivendren
  - Department of Pharmaceutical Chemistry, Swamy Vivekanandha College of Pharmacy, Elayampalayam, Tiruchengode, 637205, India
- Saravanan Govindaraj
  - Department of Pharmaceutical Chemistry, MNR College of Pharmacy, Fasalwadi, Sangareddy, 502 001, India
- Selvaraj Kunjiappan
  - Department of Biotechnology, Kalasalingam Academy of Research and Education, Krishnankoil, 626126, India
- Parasuraman Pavadai
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, M.S.R. Nagar, Bengaluru, 560054, India
|
11
|
Magurany KA, Chang X, Clewell R, Coecke S, Haugabrooks E, Marty S. A Pragmatic Framework for the Application of New Approach Methodologies in One Health Toxicological Risk Assessment. Toxicol Sci 2023; 192:kfad012. [PMID: 36782355 PMCID: PMC10109535 DOI: 10.1093/toxsci/kfad012] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 02/15/2023] Open
Abstract
Globally, industries and regulatory authorities are faced with an urgent need to assess the potential adverse effects of chemicals more efficiently by embracing new approach methodologies (NAMs). NAMs include cell and tissue methods (in vitro), structure-based/toxicokinetic models (in silico), methods that assess toxicant interactions with biological macromolecules (in chemico), and alternative models. Increasing knowledge on chemical toxicokinetics (what the body does with chemicals) and toxicodynamics (what the chemicals do with the body) obtained from in silico and in vitro systems continues to provide opportunities for modernizing chemical risk assessments. However, directly leveraging in vitro and in silico data for derivation of human health-based reference values has not received regulatory acceptance due to uncertainties in extrapolating NAM results to human populations, including metabolism, complex biological pathways, multiple exposures, interindividual susceptibility and vulnerable populations. The objective of this article is to provide a standardized pragmatic framework that applies integrated approaches with a focus on quantitative in vitro to in vivo extrapolation (QIVIVE) to extrapolate in vitro cellular exposures to human equivalent doses from which human reference values can be derived. The proposed framework intends to systematically account for the complexities in extrapolation and data interpretation to support sound human health safety decisions in diverse industrial sectors (food systems, cosmetics, industrial chemicals, pharmaceuticals etc.). Case studies of chemical entities, using new and existing data, are presented to demonstrate the utility of the proposed framework while highlighting potential sources of human population bias and uncertainty, and the importance of Good Method and Reporting Practices.
Affiliation(s)
| | | | - Rebecca Clewell
- 21st Century Tox Consulting, Chapel Hill, North Carolina 27517, USA
| | - Sandra Coecke
- European Commission Joint Research Centre, Ispra, Italy
| | - Esther Haugabrooks
- Coca-Cola Company (formerly Physicians Committee for Responsible Medicine), Atlanta, Georgia 30313, USA
| | - Sue Marty
- The Dow Chemical Company, Midland, Michigan 48667, USA
| |
|
12
|
Iftakher A, Monjur MS, Hasan MMF. An Overview of Computer‐aided Molecular and Process Design. CHEM-ING-TECH 2023. [DOI: 10.1002/cite.202200172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/31/2023]
Affiliation(s)
- Ashfaq Iftakher
- Texas A&M University Artie McFerrin Department of Chemical Engineering 100 Spence St. TX 77843-3122 College Station USA
| | - Mohammed Sadaf Monjur
- Texas A&M University Artie McFerrin Department of Chemical Engineering 100 Spence St. TX 77843-3122 College Station USA
| | - M. M. Faruque Hasan
- Texas A&M University Artie McFerrin Department of Chemical Engineering 100 Spence St. TX 77843-3122 College Station USA
| |
|
13
|
Sun L, Zhang M, Xie L, Gao Q, Xu X, Xu L. In silico prediction of boiling point, octanol-water partition coefficient, and retention time index of polycyclic aromatic hydrocarbons through machine learning. Chem Biol Drug Des 2023; 101:52-68. [PMID: 35852446 DOI: 10.1111/cbdd.14121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2022] [Revised: 07/14/2022] [Accepted: 07/17/2022] [Indexed: 12/15/2022]
Abstract
Polycyclic aromatic hydrocarbons (PAHs), a special class of persistent organic pollutants (POPs) with two or more aromatic rings, have received extensive attention owing to their carcinogenic, mutagenic, and teratogenic effects. Quantitative structure-property relationship (QSPR) modeling is a powerful chemometric method for correlating structural descriptors of PAHs with their physicochemical properties. In this manuscript, a QSPR study of PAHs was performed to predict their boiling point (bp), octanol-water partition coefficient (LogKow), and retention time index (RI). In addition to traditional molecular descriptors, structural fingerprints play an important role in the correlation of the above properties. Three regression methods, partial least squares (PLS), multiple linear regression (MLR), and genetic function approximation (GFA), were used to establish QSPR models for each property of PAHs. The correlation coefficient on the test set (R2test) and root mean square error (RMSE) of the best model were 0.980 and 24.39% (PLS), 0.979 and 35.80% (GFA), and 0.926 and 22.90% (MLR) for bp, LogKow, and RI, respectively. The model proposed here can be used to estimate physicochemical properties and inform toxicity prediction of environmental chemicals.
Affiliation(s)
- Linkang Sun
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Min Zhang
- School of Computer Engineering, Jiangsu University of Technology, Changzhou, China
| | - Liangxu Xie
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Qian Gao
- School of Computer Engineering, Jiangsu University of Technology, Changzhou, China
| | - Xiaojun Xu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| | - Lei Xu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou, China
| |
|
14
|
Devi V, Awasthi P. Juvenile hormone mimics with phenyl ether and amide functionality to be insect growth regulators (IGRs): synthesis, characterization, computational and biological study. J Biomol Struct Dyn 2022; 40:13246-13264. [PMID: 34622740 DOI: 10.1080/07391102.2021.1985614] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/27/2022]
Abstract
A series of substituted phenyl ether derivatives (V1-V8) has been synthesized as juvenile hormone (JH) mimics. Substituted phenoxyacetic acid and amino acid ethyl ester hydrochloride were prepared using NaOH and SOCl2. The DCC method was used to form the amide linkage. The structures of the prepared compounds were confirmed by Fourier Transform Infra-Red (FT-IR), Electrospray ionization-Mass spectrometry (ESI-MS), and Proton and Carbon-13 nuclear magnetic resonance (1H-NMR, 13C-NMR) spectroscopic techniques. The biological efficacy of the synthesized analogs was evaluated under laboratory conditions. Galleria mellonella (a honey bee pest) was chosen as the test insect. The juvenile hormone (JH) activity of the synthesized compounds was tested at different concentrations and compared with the standard juvenile hormone analogs (JHAs) pyriproxyfen (M1) and fenoxycarb (M2) against the fifth larval instar of G. mellonella. Compound ethyl 2-[2-(4-methylphenoxy)aminoacetyl]-3-phenyl-propanoate (V6) exhibited the best activity among all the synthesized compounds (V1-V8), with LC50 and LC90 values of 0.11 mg/mL and 0.56 mg/mL, respectively. The compounds showed insect growth regulating (IGR) activity at lower concentrations. In silico screening of all synthesized compounds against the W-cavity of the juvenile hormone-binding protein (JHBP) of G. mellonella was carried out. The chemical reactivity of the synthesized series was studied using the DFT/B3LYP/6-311+G(d,2p) method. Non-toxic behavior of the molecules was also observed in an ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity) study using Discovery Studio Client 3.0. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Vandna Devi
- Department of Chemistry, National Institute of Technology, Hamirpur, Himachal Pradesh, India
| | - Pamita Awasthi
- Department of Chemistry, National Institute of Technology, Hamirpur, Himachal Pradesh, India
| |
|
15
|
Wu Q, Cao S, Chen Z, Wei X, Ma G, Yu H. Predictive Models of Gas/Particulate Partition Coefficients (KP) for Polycyclic Aromatic Hydrocarbons and Their Oxygen/Nitrogen Derivatives. Molecules 2022; 27:molecules27217608. [PMID: 36364435 PMCID: PMC9657024 DOI: 10.3390/molecules27217608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2022] [Revised: 11/03/2022] [Accepted: 11/03/2022] [Indexed: 11/09/2022] Open
Abstract
Polycyclic aromatic hydrocarbons (PAHs) and their oxygen/nitrogen derivatives released into the atmosphere can alternate between a gas phase and a particulate phase, which affects their environmental behavior and fate. The gas/particulate partition coefficient (KP) is generally used to characterize this partitioning equilibrium. In this study, the correlation between log KP of fifty PAH derivatives and their n-octanol/air partition coefficient (log KOA) was first analyzed, yielding a strong linear correlation (R2 = 0.801). Then, Gaussian 09 software was used to calculate quantum chemical descriptors of all chemicals at the M06-2X/6-311+G(d,p) level. Both stepwise multiple linear regression (MLR) and support vector machine (SVM) methods were used to develop quantitative structure-property relationship (QSPR) prediction models of log KP. They yield better statistical performance (R2 > 0.847, RMSE < 0.584) than the log KOA model. Simulation external validation and cross validation were further used to characterize the fitting performance, predictive ability, and robustness of the models. The mechanism analysis shows that intermolecular dispersion interaction and hydrogen bonding are the main factors governing the distribution of PAH derivatives between the gas phase and the particulate phase. The developed models can be used to predict log KP values of other PAH derivatives in the application domain, providing basic data for their ecological risk assessment.
|
16
|
Pantelidis P, Spartalis M, Zakynthinos G, Anastasiou A, Goliopoulou A, Oikonomou E, Iliopoulos DC, Siasos G. Artificial Intelligence: The new "fuel" to accelerate pharmaceutical development. Curr Pharm Des 2022; 28:2127-2128. [PMID: 35909280 DOI: 10.2174/1381612828666220729101103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Accepted: 06/27/2022] [Indexed: 12/07/2022]
Affiliation(s)
- Panteleimon Pantelidis
- 3rd Department of Cardiology, Sotiria Thoracic Diseases General Hospital, National and Kapodistrian University of Athens, Athens, Greece
| | - Michael Spartalis
- 3rd Department of Cardiology, Sotiria Thoracic Diseases General Hospital, National and Kapodistrian University of Athens, Athens, Greece
| | - George Zakynthinos
- 3rd Department of Cardiology, Sotiria Thoracic Diseases General Hospital, National and Kapodistrian University of Athens, Athens, Greece
| | - Artemis Anastasiou
- 3rd Department of Cardiology, Sotiria Thoracic Diseases General Hospital, National and Kapodistrian University of Athens, Athens, Greece
| | - Athina Goliopoulou
- 3rd Department of Cardiology, Sotiria Thoracic Diseases General Hospital, National and Kapodistrian University of Athens, Athens, Greece
| | - Evangelos Oikonomou
- 3rd Department of Cardiology, Sotiria Thoracic Diseases General Hospital, National and Kapodistrian University of Athens, Athens, Greece
| | - Dimitrios C Iliopoulos
- Laboratory of Experimental Surgery and Surgical Research 'N. S. Christeas', National and Kapodistrian University of Athens, Medical School, Athens, Greece
| | - Gerasimos Siasos
- 3rd Department of Cardiology, Sotiria Thoracic Diseases General Hospital, National and Kapodistrian University of Athens, Athens, Greece
| |
|
17
|
Jiang Z, Hu J, Samia A, Yu X. Predicting Active Sites in Photocatalytic Degradation Process Using an Interpretable Molecular-Image Combined Convolutional Neural Network. Catalysts 2022; 12:746. [DOI: 10.3390/catal12070746] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/10/2022] Open
Abstract
Machine-learning models have great potential to accelerate the design and performance assessment of photocatalysts, leveraging their unique advantages in detecting patterns and making predictions based on data. However, most machine-learning models are “black-box” models because they lack interpretability. This paper describes the development of an interpretable neural-network model of the performance of photocatalytic degradation of organic contaminants by TiO2. The molecular structures of the organic contaminants are represented by molecular images, which are then fed into a convolutional neural network (CNN), EfficientNet, that encodes them and extracts the critical structural features. The extracted features, together with five other experimental variables, were input to a neural network that was then trained to predict the photodegradation reaction rates of the organic contaminants by TiO2. The results show that this machine-learning (ML) model predicts the photocatalytic degradation rate of organic contaminants more accurately than a previously developed machine-learning model that used molecular fingerprint encoding. In addition, the regions of the molecular image most relevant to the photocatalytic rates can be extracted with gradient-weighted class activation mapping (Grad-CAM). This interpretable machine-learning model, leveraging the graphic interpretability of the CNN model, allows us to highlight regions of the molecular structure that serve as the active sites of water contaminants during the photocatalytic degradation process. This provides an important piece of information for understanding the influence of molecular structures on the photocatalytic degradation process.
|
18
|
Dobbelaere MR, Ureel Y, Vermeire FH, Tomme L, Stevens CV, Van Geem KM. Machine Learning for Physicochemical Property Prediction of Complex Hydrocarbon Mixtures. Ind Eng Chem Res 2022. [DOI: 10.1021/acs.iecr.2c00442] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Affiliation(s)
- Maarten R. Dobbelaere
- Laboratory for Chemical Technology, Department of Materials, Textiles and Chemical Engineering, Ghent University, Technologiepark 125, 9052 Gent, Belgium
| | - Yannick Ureel
- Laboratory for Chemical Technology, Department of Materials, Textiles and Chemical Engineering, Ghent University, Technologiepark 125, 9052 Gent, Belgium
| | - Florence H. Vermeire
- Laboratory for Chemical Technology, Department of Materials, Textiles and Chemical Engineering, Ghent University, Technologiepark 125, 9052 Gent, Belgium
| | - Lowie Tomme
- Laboratory for Chemical Technology, Department of Materials, Textiles and Chemical Engineering, Ghent University, Technologiepark 125, 9052 Gent, Belgium
| | - Christian V. Stevens
- SynBioC Research Group, Department of Green Chemistry and Technology, Faculty of Bioscience Engineering, Ghent University, Coupure Links 653, 9000 Gent, Belgium
| | - Kevin M. Van Geem
- Laboratory for Chemical Technology, Department of Materials, Textiles and Chemical Engineering, Ghent University, Technologiepark 125, 9052 Gent, Belgium
| |
|
19
|
Abstract
This work showcases the remarkable ability of sigma profiles to function as molecular descriptors in deep learning. The sigma profiles of 1432 compounds are used to train convolutional neural networks that accurately correlate and predict a wide range of physicochemical properties. The architectures developed are then exploited to include temperature as an additional feature.
Affiliation(s)
- Dinis O Abranches
- Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, Indiana 46556, USA.
| | - Yong Zhang
- Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, Indiana 46556, USA.
| | - Edward J Maginn
- Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, Indiana 46556, USA.
| | - Yamil J Colón
- Department of Chemical and Biomolecular Engineering, University of Notre Dame, Notre Dame, Indiana 46556, USA.
| |
|
20
|
Devi V, Kumari N, Awasthi P. Synthesis, characterization and insect growth regulating study of beta-alanine substituted sulfonamide derivatives as juvenile hormone mimics. PHOSPHORUS SULFUR 2022. [DOI: 10.1080/10426507.2022.2061970] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Vandna Devi
- Department of Chemistry, National Institute of Technology, Hamirpur, India
| | - Neetika Kumari
- Department of Chemistry, National Institute of Technology, Hamirpur, India
| | - Pamita Awasthi
- Department of Chemistry, National Institute of Technology, Hamirpur, India
| |
|
21
|
Hoffmann S, Alépée N, Gilmour N, Kern PS, van Vliet E, Boislève F, Bury D, Cloudet E, Klaric M, Kühnl J, Lalko JF, Mewes K, Miyazawa M, Nishida H, Tam Brami MT, Varçin M, Api AM, Europe C. Expansion of the Cosmetics Europe skin sensitisation database with new substances and PPRA data. Regul Toxicol Pharmacol 2022;:105169. [PMID: 35447229 DOI: 10.1016/j.yrtph.2022.105169] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/13/2022] [Accepted: 04/13/2022] [Indexed: 11/21/2022]
Abstract
The assessment of skin sensitisation is a key requirement in all regulated sectors, with the European Union's regulation of cosmetic ingredients being most challenging, since it requires quantitative skin sensitisation assessment based on new approach methodologies (NAMs). To address this challenge, an in-depth and harmonised understanding of NAMs is fundamental to inform the assessment. Therefore, we compiled a database of NAMs, and in vivo (human and local lymph node assay) reference data. Here, we expanded this database with 41 substances highly relevant for cosmetic industry. These structurally different substances were tested in six NAMs (Direct Peptide Reactivity Assay, KeratinoSens™, human Cell Line Activation Test, U-SENS™, SENS-IS, Peroxidase Peptide Reactivity Assay). Our analysis revealed that the substances could be tested without technical limitations, but were generally overpredicted when compared to reference results. Reasons for this reduced predictivity were explored through pairwise NAM comparisons and association of overprediction with hydrophobicity. We conclude that more detailed understanding of how NAMs apply to a wider range of substances is needed. This would support a flexible and informed choice of NAMs to be optimally applied in the context of a next generation risk assessment framework, ultimately contributing to the characterisation and reduction of uncertainty.
|
22
|
Zhang D, Xia S, Zhang Y. Accurate Prediction of Aqueous Free Solvation Energies Using 3D Atomic Feature-Based Graph Neural Network with Transfer Learning. J Chem Inf Model 2022; 62:1840-1848. [PMID: 35422122 PMCID: PMC9038704 DOI: 10.1021/acs.jcim.2c00260] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Indexed: 11/28/2022]
Abstract
Graph neural network (GNN)-based deep learning (DL) models have been widely implemented to predict the experimental aqueous solvation free energy, while its prediction accuracy has reached a plateau partly due to the scarcity of available experimental data. In order to tackle this challenge, we first build a large and diverse calculated data set Frag20-Aqsol-100K of aqueous solvation free energy with reasonable computational cost and accuracy via electronic structure calculations with continuum solvent models. Then, we develop a novel 3D atomic feature-based GNN model with the principal neighborhood aggregation (PNAConv) and demonstrate that 3D atomic features obtained from molecular mechanics-optimized geometries can significantly improve the learning power of GNN models in predicting calculated solvation free energies. Finally, we employ a transfer learning strategy by pre-training our DL model on Frag20-Aqsol-100K and fine-tuning it on the small experimental data set, and the fine-tuned model A3D-PNAConv-FT achieves the state-of-the-art prediction on the FreeSolv data set with a root-mean-squared error of 0.719 kcal/mol and a mean-absolute error of 0.417 kcal/mol using random data splits. These results indicate that integrating molecular modeling and DL would be a promising strategy to develop robust prediction models in molecular science. The source code and data are accessible at: https://yzhang.hpc.nyu.edu/IMA.
Affiliation(s)
- Dongdong Zhang
- Department of Chemistry, New York University, New York, New York 10003, United States
| | - Song Xia
- Department of Chemistry, New York University, New York, New York 10003, United States
| | - Yingkai Zhang
- Department of Chemistry, New York University, New York, New York 10003, United States
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
| |
|
23
|
Jeong K, Lee JY, Woo S, Kim D, Jeon Y, Ryu TI, Hwang SR, Jeong WH. Vapor Pressure and Toxicity Prediction for Novichok Agent Candidates Using Machine Learning Model: Preparation for Unascertained Nerve Agents after Chemical Weapons Convention Schedule 1 Update. Chem Res Toxicol 2022; 35:774-781. [PMID: 35317551 DOI: 10.1021/acs.chemrestox.1c00410] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022]
Abstract
The recent terrorist attacks using Novichok agents and subsequent operations have necessitated an understanding of their physicochemical properties, such as vapor pressure and toxicity, as well as of unascertained nerve agent structures. To prevent continued threats from new types of nerve agents, the Organisation for the Prohibition of Chemical Weapons (OPCW) updated the Chemical Weapons Convention (CWC) Schedule 1 list. However, this information is vague and may encompass more than 10 000 possible chemical structures, which makes it almost impossible to synthesize them and measure their properties and toxicity. To assist this effort, we developed machine learning (ML) models that predict vapor pressure to help with escape and removal operations. The model shows robust, high-accuracy performance with promising features for predicting vapor pressure, and gives predictions with reasonable errors when applied to Novichok materials. An ML classification model was also built for the swallow Globally Harmonized System class of organophosphorus compounds (OP) for toxicity predictions. The tuned ML model was used to predict the toxicity of Novichok agents, as described in the CWC list. Although its accuracy and linearity can be improved, this ML model is expected to be a firm basis for developing more accurate models for predicting the vapor pressure and toxicity of nerve agents, to help handle future terror attacks with unknown nerve agents.
Affiliation(s)
- Keunhong Jeong
- Department of Chemistry, Korea Military Academy, Seoul 01805, South Korea
| | - Jin-Young Lee
- Agency for Defense Development (ADD), P.O. Box 35, Yuseong-gu, Daejeon 34186, South Korea
| | - Seungmin Woo
- Department of Nuclear and Energy Engineering, Jeju National University, Jeju, 63243, South Korea
| | - Dongwoo Kim
- Department of Chemistry, Korea Military Academy, Seoul 01805, South Korea
| | - Yonggoon Jeon
- Department of Chemistry, Korea Military Academy, Seoul 01805, South Korea
| | - Tae In Ryu
- Accident Coordination and Training Division, National Institute of Chemical Safety (NICS), 90 Gajeongbuk-rO, Yuseong-gu, Daejeon 34114, South Korea
| | - Seung-Ryul Hwang
- Accident Coordination and Training Division, National Institute of Chemical Safety (NICS), 90 Gajeongbuk-rO, Yuseong-gu, Daejeon 34114, South Korea
| | - Woo-Hyeon Jeong
- Agency for Defense Development (ADD), P.O. Box 35, Yuseong-gu, Daejeon 34186, South Korea
| |
|
24
|
Lane TR, Urbina F, Rank L, Gerlach J, Riabova O, Lepioshkin A, Kazakova E, Vocat A, Tkachenko V, Cole S, Makarov V, Ekins S. Machine Learning Models for Mycobacterium tuberculosis In Vitro Activity: Prediction and Target Visualization. Mol Pharm 2022; 19:674-689. [PMID: 34964633 PMCID: PMC9121329 DOI: 10.1021/acs.molpharmaceut.1c00791] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Abstract
Tuberculosis (TB) is a major global health challenge, with approximately 1.4 million deaths per year. There is still a need to develop novel treatments for patients infected with Mycobacterium tuberculosis (Mtb). There have been many large-scale phenotypic screens that have led to the identification of thousands of new compounds. Yet, there is very limited investment in TB drug discovery which points to the need for new methods to increase the efficiency of drug discovery against Mtb. We have used machine learning approaches to learn from the public Mtb data, resulting in many data sets and models with robust enrichment and hit rates leading to the discovery of new active compounds. Recently, we have curated predominantly small-molecule Mtb data and developed new machine learning classification models with 18 886 molecules at different activity cutoffs. We now describe the further validation of these Bayesian models using a library of over 1000 molecules synthesized as part of EU-funded New Medicines for TB and More Medicines for TB programs. We highlight molecular features which are enriched in these active compounds. In addition, we provide new regression and classification models that can be used for scoring compound libraries or used to design new molecules. We have also visualized these molecules in the context of known molecular targets and identified clusters in chemical property space, which may aid in future target identification efforts. Finally, we are also making these data sets publicly available, representing a significant increase to the available Mtb inhibition data in the public domain.
Affiliation(s)
- Thomas R. Lane
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510 Raleigh, NC 27606, USA
| | - Fabio Urbina
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510 Raleigh, NC 27606, USA
| | - Laura Rank
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510 Raleigh, NC 27606, USA
| | - Jacob Gerlach
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510 Raleigh, NC 27606, USA
| | - Olga Riabova
- Research Center of Biotechnology RAS, 119071 Moscow, Russia
| | | | - Elena Kazakova
- Research Center of Biotechnology RAS, 119071 Moscow, Russia
| | - Anthony Vocat
- Global Health Institute, Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
| | - Valery Tkachenko
- Science Data Experts, 14909 Forest Landing Cir, Rockville, MD 20850
| | | | - Vadim Makarov
- Research Center of Biotechnology RAS, 119071 Moscow, Russia
| | - Sean Ekins
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510 Raleigh, NC 27606, USA
| |
|
25
|
Qu C, Kearsley AJ, Schneider BI, Keyrouz W, Allison TC. Graph convolutional neural network applied to the prediction of normal boiling point. J Mol Graph Model 2022; 112:108149. [DOI: 10.1016/j.jmgm.2022.108149] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/30/2021] [Revised: 01/19/2022] [Accepted: 02/02/2022] [Indexed: 11/29/2022]
|
26
|
Bozlul Karim M, Kanaya S, Altaf-Ul-Amin M. Antibacterial Activity Prediction of Plant Secondary Metabolites Based on a Combined Approach of Graph Clustering and Deep Neural Network. Mol Inform 2022; 41:e2100247. [PMID: 35014190 PMCID: PMC9400908 DOI: 10.1002/minf.202100247] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/22/2021] [Accepted: 01/09/2022] [Indexed: 11/20/2022]
Abstract
Plants produce numerous types of secondary metabolites that have pharmacological importance in drug development for different diseases. Computational methods widely use the fingerprints of metabolites to understand their properties and similarities and to predict chemical reactions, among other tasks. In this work, we developed three different deep neural network (DNN) models to predict the antibacterial property of plant metabolites. We developed the first DNN model using the fingerprint set of metabolites as features. In the second DNN model, we searched for similarities among fingerprints using correlation and used one representative feature from each group of highly correlated fingerprints. In the third model, the fingerprints of metabolites were used to find clusters of structurally similar chemical compounds. From each cluster, a representative metabolite was selected and made part of the training dataset. The second model reduced the number of features, whereas the third model achieved better classification results for test data. In both cases, we applied a simple graph clustering method to cluster the corresponding network. The correlation‐based DNN model reduced some features while retaining performance almost similar to that of the first DNN model. The third model improves classification results for test data by capturing wider variance within the training data using the graph clustering method. This third model is a somewhat novel approach and can be applied to build DNN models for other purposes.
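The feature-reduction idea in the second model (keep one representative from each group of highly correlated fingerprint columns) can be sketched as follows. This is only an illustration under stated assumptions: it swaps the abstract's graph clustering for a simple greedy threshold rule, and the function names, the 0.9 threshold, and the toy fingerprint columns are all hypothetical rather than taken from the paper.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no correlation information
    return sxy / (sxx * syy) ** 0.5


def representative_features(columns, threshold=0.9):
    """Greedy stand-in for clustering: keep a column only if it is not
    strongly correlated (|r| >= threshold) with any column already kept."""
    kept = []
    for idx, col in enumerate(columns):
        if all(abs(pearson(col, columns[k])) < threshold for k in kept):
            kept.append(idx)
    return kept


# Hypothetical fingerprint bits: one list per feature, one entry per metabolite.
fingerprints = [
    [1, 0, 1, 0, 1, 0],
    [1, 0, 1, 0, 1, 0],  # duplicate of column 0
    [0, 1, 0, 1, 0, 1],  # exact inverse of column 0
    [1, 1, 0, 0, 1, 0],  # only weakly correlated with column 0
]
selected = representative_features(fingerprints)
print(selected)
```

With these toy columns, the duplicate and the inverse of column 0 are dropped and the weakly correlated column is retained.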
|
27
|
Fujii T, Kobune M. Prediction of partition coefficient in high-pressure carbon dioxide–water systems using machine learning. J Supercrit Fluids 2022. [DOI: 10.1016/j.supflu.2021.105421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
|
28
|
Reetz MT, König G. n‐Butanol: An Ecologically and Economically Viable Extraction Solvent for Isolating Polar Products from Aqueous Solutions. European J Org Chem 2021. [DOI: 10.1002/ejoc.202100829] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
Affiliation(s)
- Manfred T. Reetz
  - Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470 Mülheim an der Ruhr, Germany
  - Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, China
- Gerhard König
  - Centre for Enzyme Innovation, University of Portsmouth, St Michael's Building, Portsmouth PO1 2DT, United Kingdom
|
29
|
Kang Y, Jeong B, Lim DH, Lee D, Lim KM. In silico prediction of the full United Nations Globally Harmonized System eye irritation categories of liquid chemicals by IATA-like bottom-up approach of random forest method. J Toxicol Environ Health A 2021; 84:960-972. [PMID: 34328061 DOI: 10.1080/15287394.2021.1956661] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/13/2023]
Abstract
As an alternative to in vivo Draize rabbit eye irritation test, this study aimed to construct an in silico model to predict the complete United Nations (UN) Globally Harmonized System (GHS) for classification and labeling of chemicals for eye irritation category [eye damage (Category 1), irritating to eye (Category 2) and nonirritating (No category)] of liquid chemicals with Integrated approaches to testing and assessment (IATA)-like two-stage random forest approach. Liquid chemicals (n = 219) with 34 physicochemical descriptors and quality in vivo data were collected with no missing values. Seven machine learning algorithms (Naive Bayes, Logistic Regression, First Large Margin, Neural Net, Random Forest (RF), Gradient Boosted Tree, and Support Vector Machine) were examined for the ternary categorization of eye irritation potential at a single run through 10-fold cross-validation. RF, which performed best, was further improved by applying the 'Bottom-up approach' concept of IATA, namely, separating No category first, and discriminating Category 1 from 2, thereafter. The best performing training dataset achieved an overall accuracy of 73% and the correct prediction for Category 1, 2, and No category was 80%, 50%, and 77%, respectively for the test dataset. This prediction model was further validated with an external dataset of 28 chemicals, for which an overall accuracy of 71% was achieved.
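The two-stage "bottom-up" decision order described in this abstract (separate No category first, then discriminate Category 1 from Category 2) can be sketched as a small cascade. This is only an illustration of the control flow under stated assumptions: the study used random forests at each stage, whereas the stage models below are hypothetical threshold rules on made-up descriptor values, and all function names are invented for the example.

```python
from typing import Callable, Sequence

Features = Sequence[float]


def make_bottom_up_classifier(
    is_irritant: Callable[[Features], bool],
    is_category_1: Callable[[Features], bool],
) -> Callable[[Features], str]:
    """Chain two binary decisions into a three-class GHS-style label.

    Stage 1 filters out non-irritants; stage 2 is only consulted for
    chemicals that stage 1 flagged as irritants.
    """

    def classify(x: Features) -> str:
        if not is_irritant(x):
            return "No category"
        return "Category 1" if is_category_1(x) else "Category 2"

    return classify


# Hypothetical stand-ins for the two trained models (assumptions, not the
# paper's descriptors or decision rules).
def stage_one(x: Features) -> bool:
    return x[0] > 0.5


def stage_two(x: Features) -> bool:
    return x[1] > 0.5


classify = make_bottom_up_classifier(stage_one, stage_two)
labels = [classify(x) for x in [(0.2, 0.9), (0.8, 0.9), (0.8, 0.1)]]
```

Any pair of fitted binary classifiers could be wrapped and passed in place of the two toy rules; the cascade itself does not depend on the model type.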
Affiliation(s)
- Yeonsoo Kang
  - College of Pharmacy, Ewha Womans University, Seoul, Republic of Korea
- Boram Jeong
  - Department of Statistics, Ewha Womans University, Seoul, Republic of Korea
- Donghwan Lee
  - Department of Statistics, Ewha Womans University, Seoul, Republic of Korea
- Kyung-Min Lim
  - College of Pharmacy, Ewha Womans University, Seoul, Republic of Korea
|
30
|
Deng J, Yang Z, Ojima I, Samaras D, Wang F. Artificial intelligence in drug discovery: applications and techniques. Brief Bioinform 2021; 23:6420092. [PMID: 34734228 DOI: 10.1093/bib/bbab430] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 08/02/2021] [Revised: 08/02/2021] [Accepted: 09/18/2021] [Indexed: 12/23/2022] Open
Abstract
Artificial intelligence (AI) has been transforming the practice of drug discovery in the past decade. Various AI techniques have been used in many drug discovery applications, such as virtual screening and drug design. In this survey, we first give an overview on drug discovery and discuss related applications, which can be reduced to two major tasks, i.e. molecular property prediction and molecule generation. We then present common data resources, molecule representations and benchmark platforms. As a major part of the survey, AI techniques are dissected into model architectures and learning paradigms. To reflect the technical development of AI in drug discovery over the years, the surveyed works are organized chronologically. We expect that this survey provides a comprehensive review on AI in drug discovery. We also provide a GitHub repository with a collection of papers (and codes, if applicable) as a learning resource, which is regularly updated.
Affiliation(s)
- Jianyuan Deng
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY 11790, USA
- Zhibo Yang
  - Department of Computer Science, Stony Brook University, Stony Brook, NY 11790, USA
- Iwao Ojima
  - Department of Chemistry, Stony Brook University, Stony Brook, NY 11790, USA
- Dimitris Samaras
  - Department of Computer Science, Stony Brook University, Stony Brook, NY 11790, USA
- Fusheng Wang
  - Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY 11790, USA
  - Department of Computer Science, Stony Brook University, Stony Brook, NY 11790, USA
|
31
|
Pérez Santín E, Rodríguez Solana R, González García M, García Suárez MDM, Blanco Díaz GD, Cima Cabal MD, Moreno Rojas JM, López Sánchez JI. Toxicity prediction based on artificial intelligence: A multidisciplinary overview. WIREs Comput Mol Sci 2021. [DOI: 10.1002/wcms.1516] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/29/2022]
Affiliation(s)
- Efrén Pérez Santín
  - Escuela Superior de Ingeniería y Tecnología (ESIT), Universidad Internacional de La Rioja (UNIR), Logroño, Spain
- Raquel Rodríguez Solana
  - Department of Food Science and Health, Andalusian Institute of Agricultural and Fisheries Research and Training (IFAPA), Alameda del Obispo Avda Córdoba, Andalucía, Spain
- Mariano González García
  - Escuela Superior de Ingeniería y Tecnología (ESIT), Universidad Internacional de La Rioja (UNIR), Logroño, Spain
- María Del Mar García Suárez
  - Escuela Superior de Ingeniería y Tecnología (ESIT), Universidad Internacional de La Rioja (UNIR), Logroño, Spain
- Gerardo David Blanco Díaz
  - Escuela Superior de Ingeniería y Tecnología (ESIT), Universidad Internacional de La Rioja (UNIR), Logroño, Spain
- María Dolores Cima Cabal
  - Escuela Superior de Ingeniería y Tecnología (ESIT), Universidad Internacional de La Rioja (UNIR), Logroño, Spain
- José Manuel Moreno Rojas
  - Department of Food Science and Health, Andalusian Institute of Agricultural and Fisheries Research and Training (IFAPA), Alameda del Obispo Avda Córdoba, Andalucía, Spain
- José Ignacio López Sánchez
  - Escuela Superior de Ingeniería y Tecnología (ESIT), Universidad Internacional de La Rioja (UNIR), Logroño, Spain
|
32
|
Abstract
Artificial intelligence (AI), a method of simulating the human brain in order to complete tasks in a more effective manner, has had numerous implementations in fields from manufacturing sectors to digital electronics. Despite the potential of AI, it may be obstinate to assume that the person administered society would rely solely on AI; with an example being the healthcare field. With the ever-expanding discoveries made on a regular basis regarding the growth of various diseases and its preservations, utilizing brain power may be deemed essential, but that doesn’t leave AI as a redundant asset. With the years of accumulated data regarding patterns and the analysis of various medical circumstances, algorithms can be formed, which could further assist in situations such as diagnosis support and population health management. This matter becomes even more relevant in today’s society with the currently ongoing COVID-19 pandemic by SARS-CoV-2. With the uncertainty of this pandemic from strain variants to the rolling speeds of vaccines, AI could be utilized to our advantage in order to assist us with the fight against COVID-19. This review briefly discusses the application of AI in the COVID-19 situation for various health benefits.
|
33
|
Gupta R, Srivastava D, Sahu M, Tiwari S, Ambasta RK, Kumar P. Artificial intelligence to deep learning: machine intelligence approach for drug discovery. Mol Divers 2021; 25:1315-1360. [PMID: 33844136 PMCID: PMC8040371 DOI: 10.1007/s11030-021-10217-3] [Citation(s) in RCA: 228] [Impact Index Per Article: 76.0] [Received: 01/29/2021] [Accepted: 03/22/2021] [Indexed: 02/06/2023]
Abstract
Drug designing and development is an important area of research for pharmaceutical companies and chemical scientists. However, low efficacy, off-target delivery, time consumption, and high cost impose a hurdle and challenges that impact drug design and discovery. Further, complex and big data from genomics, proteomics, microarray data, and clinical trials also impose an obstacle in the drug discovery pipeline. Artificial intelligence and machine learning technology play a crucial role in drug discovery and development. In other words, artificial neural networks and deep learning algorithms have modernized the area. Machine learning and deep learning algorithms have been implemented in several drug discovery processes such as peptide synthesis, structure-based virtual screening, ligand-based virtual screening, toxicity prediction, drug monitoring and release, pharmacophore modeling, quantitative structure-activity relationship, drug repositioning, polypharmacology, and physiochemical activity. Evidence from the past strengthens the implementation of artificial intelligence and deep learning in this field. Moreover, novel data mining, curation, and management techniques provided critical support to recently developed modeling algorithms. In summary, artificial intelligence and deep learning advancements provide an excellent opportunity for rational drug design and discovery process, which will eventually impact mankind. The primary concern associated with drug design and development is time consumption and production cost. Further, inefficiency, inaccurate target delivery, and inappropriate dosage are other hurdles that inhibit the process of drug delivery and development. With advancements in technology, computer-aided drug design integrating artificial intelligence algorithms can eliminate the challenges and hurdles of traditional drug design and development. 
Artificial intelligence is referred to as superset comprising machine learning, whereas machine learning comprises supervised learning, unsupervised learning, and reinforcement learning. Further, deep learning, a subset of machine learning, has been extensively implemented in drug design and development. The artificial neural network, deep neural network, support vector machines, classification and regression, generative adversarial networks, symbolic learning, and meta-learning are examples of the algorithms applied to the drug design and discovery process. Artificial intelligence has been applied to different areas of drug design and development process, such as from peptide synthesis to molecule design, virtual screening to molecular docking, quantitative structure-activity relationship to drug repositioning, protein misfolding to protein-protein interactions, and molecular pathway identification to polypharmacology. Artificial intelligence principles have been applied to the classification of active and inactive, monitoring drug release, pre-clinical and clinical development, primary and secondary drug screening, biomarker development, pharmaceutical manufacturing, bioactivity identification and physiochemical properties, prediction of toxicity, and identification of mode of action.
Affiliation(s)
- Rohan Gupta
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Devesh Srivastava
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Mehar Sahu
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Swati Tiwari
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Rashmi K Ambasta
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Pravir Kumar
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
|
34
|
Vatansever S, Schlessinger A, Wacker D, Kaniskan HÜ, Jin J, Zhou M, Zhang B. Artificial intelligence and machine learning-aided drug discovery in central nervous system diseases: State-of-the-arts and future directions. Med Res Rev 2021; 41:1427-1473. [PMID: 33295676 PMCID: PMC8043990 DOI: 10.1002/med.21764] [Citation(s) in RCA: 83] [Impact Index Per Article: 27.7] [Received: 07/23/2020] [Revised: 10/30/2020] [Accepted: 11/20/2020] [Indexed: 01/11/2023]
Abstract
Neurological disorders significantly outnumber diseases in other therapeutic areas. However, developing drugs for central nervous system (CNS) disorders remains the most challenging area in drug discovery, accompanied with the long timelines and high attrition rates. With the rapid growth of biomedical data enabled by advanced experimental technologies, artificial intelligence (AI) and machine learning (ML) have emerged as an indispensable tool to draw meaningful insights and improve decision making in drug discovery. Thanks to the advancements in AI and ML algorithms, now the AI/ML-driven solutions have an unprecedented potential to accelerate the process of CNS drug discovery with better success rate. In this review, we comprehensively summarize AI/ML-powered pharmaceutical discovery efforts and their implementations in the CNS area. After introducing the AI/ML models as well as the conceptualization and data preparation, we outline the applications of AI/ML technologies to several key procedures in drug discovery, including target identification, compound screening, hit/lead generation and optimization, drug response and synergy prediction, de novo drug design, and drug repurposing. We review the current state-of-the-art of AI/ML-guided CNS drug discovery, focusing on blood-brain barrier permeability prediction and implementation into therapeutic discovery for neurological diseases. Finally, we discuss the major challenges and limitations of current approaches and possible future directions that may provide resolutions to these difficulties.
Affiliation(s)
- Sezen Vatansever
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Avner Schlessinger
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Mount Sinai Center for Therapeutics Discovery, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Daniel Wacker
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Mount Sinai Center for Therapeutics Discovery, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- H. Ümit Kaniskan
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Mount Sinai Center for Therapeutics Discovery, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Department of Oncological Sciences, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Jian Jin
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Mount Sinai Center for Therapeutics Discovery, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Department of Oncological Sciences, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Ming‐Ming Zhou
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Department of Oncological Sciences, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Bin Zhang
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
|
35
|
Liu X, Zhang H, Xue Q, Pan W, Zhang A. In silico health effect prioritization of environmental chemicals through transcriptomics data exploration from a chemo-centric view. Sci Total Environ 2021; 762:143082. [PMID: 33143927 DOI: 10.1016/j.scitotenv.2020.143082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2020] [Revised: 10/11/2020] [Accepted: 10/11/2020] [Indexed: 06/11/2023]
Abstract
With the explosive growth of synthetic compounds, the health effects caused by exogenous chemical exposure have attracted more and more public attention. The prediction of health effect is a never-ending story. Collective resource of transcriptomics data offers an opportunity to understand and identify the multiple health effects of small molecule. Inspired by the fact that environmental chemicals of high health risk frequently share both similar gene expression profile and common structural feature of certain drugs, we here propose a novel computational effect prioritization method for environmental chemicals through transcriptomics data exploration from a chemo-centric view. Specifically, non-negative matrix factorization (NMF) method has been adopted to get the association network linking structural features with transcriptomics characteristics of drugs with specific effects. The model yields 13 pivotal types of effects, so-called components, that represent drug categories with common chemo- and geno- type features. Moreover, the established model effectively prioritizes potential toxic effects for the external chemicals from the endocrine disruptor screening program (EDSP) for their potential estrogenicity and other verified risks. Even if only the highest priority is set for the estrogenic effect, the precision and recall can reach 0.76 and 0.77 respectively for these chemicals. Our effort provides a successful endeavor as to profile potential toxic effects simultaneously for environmental chemicals using both chemical and omics data.
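The precision and recall figures quoted in this abstract follow the standard definitions, which the short sketch below spells out for a single prioritized effect. The chemical identifiers and set contents are hypothetical placeholders, not data from the study, and the function name is chosen only for this example.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for a set of predicted positives.

    precision = true positives / all predicted positives
    recall    = true positives / all actual positives
    Either value is reported as 0.0 when its denominator is empty.
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical example: chemicals whose top-ranked effect is "estrogenic"
# versus chemicals with verified estrogenic activity.
flagged = {"chem_a", "chem_b", "chem_c", "chem_d"}
verified = {"chem_a", "chem_b", "chem_c", "chem_e", "chem_f"}
precision, recall = precision_recall(flagged, verified)
```

In this toy case three of the four flagged chemicals are verified (precision 0.75), and three of the five verified chemicals are flagged (recall 0.6).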
Affiliation(s)
- Xian Liu
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China
  - College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100190, PR China
- Huazhou Zhang
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China
  - College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100190, PR China
- Qiao Xue
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China
- Wenxiao Pan
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China
- Aiqian Zhang
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China
  - College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100190, PR China
  - Institute of Environment and Health, Jianghan University, Wuhan 430056, PR China
|
36
|
Sifain AE, Rice BM, Yalkowsky SH, Barnes BC. Machine learning transition temperatures from 2D structure. J Mol Graph Model 2021; 105:107848. [PMID: 33667863 DOI: 10.1016/j.jmgm.2021.107848] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/17/2020] [Revised: 01/11/2021] [Accepted: 01/19/2021] [Indexed: 10/22/2022]
Abstract
A priori knowledge of physicochemical properties such as melting and boiling could expedite materials discovery. However, theoretical modeling from first principles poses a challenge for efficient virtual screening of potential candidates. As an alternative, the tools of data science are becoming increasingly important for exploring chemical datasets and predicting material properties. Herein, we extend a molecular representation, or set of descriptors, first developed for quantitative structure-property relationship modeling by Yalkowsky and coworkers known as the Unified Physicochemical Property Estimation Relationships (UPPER). This molecular representation has group-constitutive and geometrical descriptors that map to enthalpy and entropy; two thermodynamic quantities that drive thermal phase transitions. We extend the UPPER representation to include additional information about sp2-bonded fragments. Additionally, instead of using the UPPER descriptors in a series of thermodynamically-inspired calculations, as per Yalkowsky, we use the descriptors to construct a vector representation for use with machine learning techniques. The concise and easy-to-compute representation, combined with a gradient-boosting decision tree model, provides an appealing framework for predicting experimental transition temperatures in a diverse chemical space. An application to energetic materials shows that the method is predictive, despite a relatively modest energetics reference dataset. We also report competitive results on diverse public datasets of melting points (i.e., OCHEM, Enamine, Bradley, and Bergström) comprised of over 47k structures. Open source software is available at https://github.com/USArmyResearchLab/ARL-UPPER.
Affiliation(s)
- Andrew E Sifain
  - CCDC Army Research Laboratory, Aberdeen Proving Ground, MD, 21005, USA
- Betsy M Rice
  - CCDC Army Research Laboratory, Aberdeen Proving Ground, MD, 21005, USA
- Samuel H Yalkowsky
  - Department of Pharmaceutics, College of Pharmacy, University of Arizona, Tucson, AZ, 85721, USA
- Brian C Barnes
  - CCDC Army Research Laboratory, Aberdeen Proving Ground, MD, 21005, USA
|
37
|
Abstract
Data science has revolutionized chemical research and continues to break down barriers with new interdisciplinary studies. The introduction of computational models and machine learning (ML) algorithms in combination with automation and traditional experimental techniques has enabled scientific advancement across nearly every discipline of chemistry, from materials discovery, to process optimization, to synthesis planning. However, predictive tools powered by data science are only as good as their data sets and, currently, many of the data sets used to train models suffer from several limitations, including being sparse, limited in scope and requiring human curation. Likewise, computational data faces limitations in terms of accurate modeling of nonideal systems and can suffer from low translation fidelity from simulation to real conditions. The lack of diverse data and the need to be able to test it experimentally reduces both the accuracy and scope of the predictive models derived from data science. This Account contextualizes the need for more complex and diverse experimental data and highlights how the seamless integration of robotics, machine learning, and data-rich monitoring techniques can be used to access it with minimal human labor. We propose three broad categories of data in chemistry: data on fundamental properties, data on reaction outcomes, and data on reaction mechanics. We highlight flexible, automated platforms that can be deployed to acquire and leverage these data. The first platform combines solid- and liquid-dosing modules with computer vision to automate solubility screening, thereby gathering fundamental data that are necessary for almost every experimental design. Using computer vision offers the additional benefit of creating a visual record, which can be referenced and used to further interrogate and gain insight on the data collected.
The second platform iteratively tests reaction variables proposed by a ML algorithm in a closed-loop fashion. Experimental data related to reaction outcomes are fed back into the algorithm to drive the discovery and optimization of new materials and chemical processes. The third platform uses automated process analytical technology to gather real-time data related to reaction kinetics. This system allows the researcher to directly interrogate the reaction mechanisms in granular detail to determine exactly how and why a reaction proceeds, thereby enabling reaction optimization and deployment.
Affiliation(s)
- Yao Shi
  - Department of Chemistry, University of British Columbia, 2036 Main Mall, Vancouver, British Columbia V6T 1Z3, Canada
- Paloma L. Prieto
  - Department of Chemistry, University of British Columbia, 2036 Main Mall, Vancouver, British Columbia V6T 1Z3, Canada
- Tara Zepel
  - Department of Chemistry, University of British Columbia, 2036 Main Mall, Vancouver, British Columbia V6T 1Z3, Canada
- Shad Grunert
  - Department of Chemistry, University of British Columbia, 2036 Main Mall, Vancouver, British Columbia V6T 1Z3, Canada
- Jason E. Hein
  - Department of Chemistry, University of British Columbia, 2036 Main Mall, Vancouver, British Columbia V6T 1Z3, Canada
|
38
|
Paul D, Sanap G, Shenoy S, Kalyane D, Kalia K, Tekade RK. Artificial intelligence in drug discovery and development. Drug Discov Today 2021; 26:80-93. [PMID: 33099022 PMCID: PMC7577280 DOI: 10.1016/j.drudis.2020.10.010] [Citation(s) in RCA: 281] [Impact Index Per Article: 93.7] [Received: 04/21/2020] [Revised: 09/03/2020] [Accepted: 10/13/2020] [Indexed: 02/07/2023]
Abstract
Artificial intelligence-integrated drug discovery and development has accelerated the growth of the pharmaceutical sector, leading to a revolutionary change in the pharma industry. Here, we discuss areas of integration, tools, and techniques utilized in enforcing AI, ongoing challenges, and ways to overcome them.
Affiliation(s)
- Debleena Paul
  - National Institute of Pharmaceutical Education and Research-Ahmedabad (NIPER-A), An Institute of National Importance, Government of India, Department of Pharmaceuticals, Ministry of Chemicals and Fertilizers, Palaj, Opp. Air Force Station, Gandhinagar, 382355, Gujarat, India
- Gaurav Sanap
  - National Institute of Pharmaceutical Education and Research-Ahmedabad (NIPER-A), An Institute of National Importance, Government of India, Department of Pharmaceuticals, Ministry of Chemicals and Fertilizers, Palaj, Opp. Air Force Station, Gandhinagar, 382355, Gujarat, India
- Snehal Shenoy
  - National Institute of Pharmaceutical Education and Research-Ahmedabad (NIPER-A), An Institute of National Importance, Government of India, Department of Pharmaceuticals, Ministry of Chemicals and Fertilizers, Palaj, Opp. Air Force Station, Gandhinagar, 382355, Gujarat, India
- Dnyaneshwar Kalyane
  - National Institute of Pharmaceutical Education and Research-Ahmedabad (NIPER-A), An Institute of National Importance, Government of India, Department of Pharmaceuticals, Ministry of Chemicals and Fertilizers, Palaj, Opp. Air Force Station, Gandhinagar, 382355, Gujarat, India
- Kiran Kalia
  - National Institute of Pharmaceutical Education and Research-Ahmedabad (NIPER-A), An Institute of National Importance, Government of India, Department of Pharmaceuticals, Ministry of Chemicals and Fertilizers, Palaj, Opp. Air Force Station, Gandhinagar, 382355, Gujarat, India
- Rakesh K Tekade
  - National Institute of Pharmaceutical Education and Research-Ahmedabad (NIPER-A), An Institute of National Importance, Government of India, Department of Pharmaceuticals, Ministry of Chemicals and Fertilizers, Palaj, Opp. Air Force Station, Gandhinagar, 382355, Gujarat, India
|
39
|
McCord JP, Strynar MJ, Washington JW, Bergman EL, Goodrow SM. Emerging Chlorinated Polyfluorinated Polyether Compounds Impacting the Waters of Southwestern New Jersey Identified by Use of Nontargeted Analysis. Environ Sci Technol Lett 2020; 7:903-908. [PMID: 33553465 PMCID: PMC7863629 DOI: 10.1021/acs.estlett.0c00640] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Indexed: 05/05/2023]
Abstract
Per- and polyfluoroalkyl substances (PFAS) are a widespread, environmentally persistent class of anthropogenic chemicals that are widely used in industrial and consumer products and frequently detected in environmental media. Potential human health impacts from long-term exposure to legacy PFAS resulted in the industrial development and use of numerous replacement species in recent decades. Environmental investigative activities have been crucial in identifying the existence and environmental transport of emerging PFAS in environmental media. Previous investigations in an industrially impacted region of southwestern New Jersey have shown consistently elevated levels of legacy PFAS, motivating additional examination by non-targeted mass spectrometry to identify emerging PFAS contamination. This study applied non-targeted analysis to water samples collected in Gloucester and Salem Counties in southwestern New Jersey, revealing the existence of a series of novel chloro-perfluoro-polyether carboxylates and related PFAS species originating from an industrial PFAS user in the region. There is sparse publicly available toxicity information for the emerging chemical species, but estimated concentrations exceeded the state drinking water standards for perfluorooctanoic acid (PFOA) and perfluorononanoic acid (PFNA). Non-targeted analysis was used to estimate the effectiveness of point-of-entry water treatment systems for removal of the emerging species and reduced the abundance of PFAS by >90%.
Affiliation(s)
- James P McCord: US Environmental Protection Agency, Office of Research and Development, Center for Environmental Measurement and Modeling, Research Triangle Park, NC
- Mark J Strynar: US Environmental Protection Agency, Office of Research and Development, Center for Environmental Measurement and Modeling, Research Triangle Park, NC
- John W Washington: US Environmental Protection Agency, Office of Research and Development, Center for Environmental Measurement and Modeling, Athens, GA
- Erica L Bergman: New Jersey Department of Environmental Protection, Division of Remediation Management, Trenton, NJ
- Sandra M Goodrow: New Jersey Department of Environmental Protection, Division of Science and Research, Trenton, NJ
|
40
|
Li D, Sangion A, Li L. Evaluating consumer exposure to disinfecting chemicals against coronavirus disease 2019 (COVID-19) and associated health risks. Environ Int 2020; 145:106108. [PMID: 32927283 PMCID: PMC7470762 DOI: 10.1016/j.envint.2020.106108] [Citation(s) in RCA: 54] [Impact Index Per Article: 13.5] [Received: 06/08/2020] [Revised: 08/03/2020] [Accepted: 08/31/2020] [Indexed: 05/19/2023]
Abstract
Disinfection of surfaces has been recommended as one of the most effective ways to combat the spread of the novel coronavirus (SARS-CoV-2) that causes coronavirus disease 2019 (COVID-19). However, overexposure to disinfecting chemicals may lead to unintended human health risks. Here, using an indoor fate and chemical exposure model, we estimate human exposure to 22 disinfecting chemicals on the lists recommended by various governmental agencies against COVID-19, resulting from contact with disinfected surfaces and handwashing. Three near-field exposure routes, i.e., mouthing-mediated oral ingestion, inhalation, and dermal absorption, are considered to calculate the whole-body uptake doses and blood concentrations caused by a single use per day for three age groups (3-, 14-, and 24-year-olds). We also assess the health risks by comparing the predicted whole-body uptake doses with in vivo toxicological data and the predicted blood concentrations with in vitro bioactivity data. Our results indicate that both the total exposure and the relative contribution of each exposure route vary considerably among the disinfecting chemicals due to their diverse physicochemical properties. 3-year-old children have consistently higher exposure than other age groups, especially in the scenario of contact with disinfected surfaces, due to their more frequent hand contact and mouthing activities. Due to the short duration of handwashing, we do not expect any health risk from the use of disinfecting chemicals in handwashing. In contrast, exposure from contact with disinfected surfaces may result in health risks for certain age groups, especially children, even if the surfaces are disinfected only once a day. Interestingly, risk assessments based on whole-body uptake doses and in vivo toxicological data tend to give higher risk estimates than do those based on blood concentrations and in vitro bioactivity data. Our results reveal the most important exposure routes for disinfecting chemicals used in the indoor environment; they also highlight the need for more accurate data for both chemical properties and toxicity to better understand the risks associated with the increased use of disinfecting chemicals in the pandemic.
Affiliation(s)
- Dingsheng Li: School of Community Health Sciences, University of Nevada Reno, Reno, NV 89557-274, United States
- Alessandro Sangion: Department of Physical and Environmental Sciences, University of Toronto Scarborough, Toronto, Ontario M1C 1A4, Canada
- Li Li: School of Community Health Sciences, University of Nevada Reno, Reno, NV 89557-274, United States
|
41
|
Le T, Winter R, Noé F, Clevert DA. Neuraldecipher - reverse-engineering extended-connectivity fingerprints (ECFPs) to their molecular structures. Chem Sci 2020; 11:10378-10389. [PMID: 34094299 PMCID: PMC8162443 DOI: 10.1039/d0sc03115a] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 06/03/2020] [Accepted: 09/10/2020] [Indexed: 12/22/2022] Open
Abstract
Protecting molecular structures from disclosure to external parties is of great relevance for industrial and private associations, such as pharmaceutical companies. Within the framework of external collaborations, it is common to exchange datasets by encoding the molecular structures into descriptors. Molecular fingerprints such as the extended-connectivity fingerprints (ECFPs) are frequently used for such an exchange, because they typically perform well on quantitative structure-activity relationship tasks. ECFPs are often considered to be non-invertible due to the way they are computed. In this paper, we present a fast reverse-engineering method to deduce the molecular structure given revealed ECFPs. Our method includes the Neuraldecipher, a neural network model that predicts a compact vector representation of compounds, given ECFPs. We then utilize another pre-trained model to retrieve the molecular structure as a SMILES representation. We demonstrate that our method is able to reconstruct molecular structures to some extent, and that reconstruction improves when ECFPs with larger fingerprint sizes are revealed. For example, given ECFP count vectors of length 4096, we are able to correctly deduce up to 69% of molecular structures on a validation set (112 K unique samples) with our method.
Affiliation(s)
- Tuan Le: Department of Digital Technologies, Bayer AG, Berlin, Germany; Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany
- Robin Winter: Department of Digital Technologies, Bayer AG, Berlin, Germany; Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany
- Frank Noé: Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany
|
42
|
Abstract
Thousands of anthropogenic chemicals are released into the environment each year, posing potential hazards to human and environmental health. Toxic chemicals may cause a variety of adverse health effects, triggering immediate symptoms or delayed effects over longer periods of time. It is thus crucial to develop methods that can rapidly screen and predict the toxicity of chemicals to limit the potential harmful impacts of chemical pollutants. Computational methods are being increasingly used in toxicity predictions. Here, the method of molecular docking is assessed for screening potential toxicity of a variety of xenobiotic compounds, including pesticides, pharmaceuticals, pollutants, and toxins derived from the chemical industry. The method predicts the binding energy of pollutants to a set of carefully selected receptors under the assumption that toxicity in many cases is related to interference with biochemical pathways. The strength of the applied method lies in its rapid generation of interaction maps between potential toxins and the targeted enzymes, which could quickly yield molecular-level information and insight into potential perturbation pathways, aiding in the prioritization of chemicals for further tests. Two scoring functions are compared: Autodock Vina and the machine-learning scoring function RF-Score-VS. The results are promising, although hampered by the accuracy of the scoring functions. The strengths and weaknesses of the docking protocol are discussed, as well as future directions for improving the accuracy for the purpose of toxicity predictions.
Affiliation(s)
- Natasha Kamerlin: Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-751 24 Uppsala, Sweden
- Mickaël G Delcey: Department of Chemistry-Ångström Laboratory, Uppsala University, SE-75120 Uppsala, Sweden
- Sergio Manzetti: Institute for Science and Technology, Fjordforsk A.S., Midtun, 6894 Vangsnes, Norway
- David van der Spoel: Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-751 24 Uppsala, Sweden
|
43
|
Kowalewski J, Ray A. Predicting novel drugs for SARS-CoV-2 using machine learning from a >10 million chemical space. Heliyon 2020; 6:e04639. [PMID: 32802980 PMCID: PMC7409807 DOI: 10.1016/j.heliyon.2020.e04639] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 05/15/2020] [Revised: 06/16/2020] [Accepted: 08/03/2020] [Indexed: 12/19/2022] Open
Abstract
There is an urgent need for the identification of effective therapeutics for COVID-19 and we have developed a machine learning drug discovery pipeline to identify several drug candidates. First, we collect assay data for 65 target human proteins known to interact with the SARS-CoV-2 proteins, including the ACE2 receptor. Next, we train machine learning models to predict inhibitory activity and use them to screen FDA registered chemicals and approved drugs (~100,000) and ~14 million purchasable chemicals. We filter predictions according to estimated mammalian toxicity and vapor pressure. Prospective volatile candidates are proposed as novel inhaled therapeutics since the nasal cavity and respiratory tracts are early bottlenecks for infection. We also identify candidates that act across multiple targets as promising for future analyses. We anticipate that this theoretical study can accelerate testing of two categories of therapeutics: repurposed drugs suited for short-term approval, and novel efficacious drugs suitable for a long-term follow up.
Affiliation(s)
- Joel Kowalewski: Interdepartmental Neuroscience Program, University of California, Riverside, CA 92521, USA
- Anandasankar Ray: Interdepartmental Neuroscience Program, University of California, Riverside, CA 92521, USA; Department of Molecular, Cell and Systems Biology, University of California, Riverside, CA 92521, USA
|
44
|
Sivaraman G, Jackson NE, Sanchez-Lengeling B, Vázquez-Mayagoitia Á, Aspuru-Guzik A, Vishwanath V, de Pablo JJ. A machine learning workflow for molecular analysis: application to melting points. Mach Learn: Sci Technol 2020. [DOI: 10.1088/2632-2153/ab8aa3] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 12/17/2022] Open
Abstract
Computational tools encompassing integrated molecular prediction, analysis, and generation are key for molecular design in a variety of critical applications. In this work, we develop a workflow for molecular analysis (MOLAN) that integrates an ensemble of supervised and unsupervised machine learning techniques to analyze molecular data sets. The MOLAN workflow combines molecular featurization, clustering algorithms, uncertainty analysis, low-bias dataset construction, high-performance regression models, graph-based molecular embeddings and attribution, and a semi-supervised variational autoencoder based on the novel SELFIES representation to enable molecular design. We demonstrate the utility of the MOLAN workflow in the context of a challenging multi-molecule property prediction problem: the determination of melting points solely from single molecule structure. This application serves as a case study for how to employ the MOLAN workflow in the context of molecular property prediction.
|
45
|
Venkatraman V. Evaluation of Molecular Fingerprints for Determining Dye Aggregation on Semiconductor Surfaces. Mol Inform 2020; 41:e2000062. [PMID: 32476288 DOI: 10.1002/minf.202000062] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/03/2020] [Accepted: 05/31/2020] [Indexed: 01/19/2023]
Abstract
Dye aggregation plays an important role in determining the photovoltaic performance of dye-sensitized solar cells. Compared with the spectra observed in solution, it is difficult to ascertain a priori whether a dye is likely to show hypsochromic (H) or bathochromic (J) aggregation until after adsorption onto the semiconductor electrode. Herein, we show that molecular fingerprint-based methods provide a fast and efficient way to discriminate between H- and J-aggregating dyes. The efficacy of the fingerprint-based classification models is demonstrated with a diverse set of over 3000 organic dyes dissolved in different solvents. Requiring only the structure of the dye and the polarity of the solvent used, the machine learning model achieves classification accuracies close to 80 %, comparable with models based on a combination of fragment counts and topological indices. For interested researchers, we have bundled the prediction tools as an R package.
|
46
|
Eichler CMA, Little JC. A framework to model exposure to per- and polyfluoroalkyl substances in indoor environments. Environ Sci Process Impacts 2020; 22:500-511. [PMID: 32141451 DOI: 10.1039/c9em00556k] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 06/10/2023]
Abstract
Per- and polyfluoroalkyl substances (PFAS) include a wide range of halogenated chemicals, which have been used as water- and stain-resistant coatings for consumer products and industrial purposes. PFAS are persistent in the environment and several are bioaccumulative, and thus relevant for human and environmental health. Given their pervasiveness, we need to understand how we are exposed to PFAS, especially in indoor environments where many people spend most of their time. Research on indoor exposure to semivolatile organic compounds (SVOCs) has progressed rapidly in recent years. Because many PFAS can be considered SVOCs, much of what has been learned about SVOCs may be used to guide research on PFAS exposure in indoor environments. Here, we briefly review what has been done to assess indoor exposure to PFAS. Then, we propose a systematic indoor exposure framework for PFAS based on methods to estimate exposure to SVOCs. We illustrate how critical parameters such as partition coefficients for different media (particles, dust, surfaces, and clothing) for different types of PFAS could be measured, how these measurements can be used in exposure models for PFAS, and how fundamental, predictive relationships might be used to estimate necessary parameters for emerging compounds.
Affiliation(s)
- Clara M A Eichler: Department of Environmental Sciences and Engineering, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
|
47
|
Duchowicz PR. QSPR studies on water solubility, octanol-water partition coefficient and vapour pressure of pesticides. SAR QSAR Environ Res 2020; 31:135-148. [PMID: 31842624 DOI: 10.1080/1062936x.2019.1699602] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/06/2019] [Accepted: 11/27/2019] [Indexed: 06/10/2023]
Abstract
The assessment of the environmental fate and (eco)toxicological effects of pesticide compounds is of crucial importance. The present review focuses on Quantitative Structure-Property Relationship (QSPR) applications to three environmentally relevant physicochemical properties of pesticides, namely water solubility, octanol-water partition coefficient, and vapour pressure, which can be used for assessing their environmental partitioning and transport as well as their exposure potential. The article reviews various QSPR applications, with special emphasis on studies developed during the 2009-2019 period.
Affiliation(s)
- P R Duchowicz: Instituto de Investigaciones Fisicoquímicas Teóricas y Aplicadas (INIFTA), CONICET, UNLP, La Plata, Argentina
|
48
|
Lui R, Guan D, Matthews S. A comparison of molecular representations for lipophilicity quantitative structure-property relationships with results from the SAMPL6 logP Prediction Challenge. J Comput Aided Mol Des 2020; 34:523-534. [PMID: 31933037 DOI: 10.1007/s10822-020-00279-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/16/2019] [Accepted: 01/08/2020] [Indexed: 12/20/2022]
Abstract
Effective representation of a molecule is required to develop useful quantitative structure-property relationships (QSPR) for accurate prediction of chemical properties. The octanol-water partition coefficient logP, a measure of lipophilicity, is an important property for pharmacological and toxicological endpoints used in the pharmaceutical and regulatory spheres. We compare physicochemical descriptors, structural keys, and circular fingerprints in their ability to effectively represent a chemical space and characterise molecular features to correlate with lipophilicity. Exploratory landscape continuity analyses revealed that whole-molecule physicochemical descriptors could map together compounds that were similar in both molecular features and logP, indicating higher potential for use in logP QSPRs compared to the substructural approach of structural keys and circular fingerprints. Indeed, logP QSPR models parameterised by physicochemical descriptors consistently performed with the lowest error. Our best performing model was a stochastic gradient descent-optimised multilinear regression with 1438 descriptors, returning an internal benchmark RMSE of 1.03 log units. This corroborates the well-established notion that lipophilicity is an additive, whole-molecule property. We externally tested the model by participating in the 2019 SAMPL6 logP Prediction Challenge and blindly predicting for 11 protein kinase inhibitor fragment-like molecules. Our model returned an RMSE of 0.49 log units, placing eighth overall and third in the empirical methods category (submission ID 'hdpuj'). Permutation feature importance analyses revealed that physicochemical descriptors could characterise predictive molecular features highly relevant to the kinase inhibitor fragment-like molecules.
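The RMSE figures quoted in this abstract (1.03 and 0.49 log units) follow the usual root-mean-square-error definition. The sketch below is only an illustration of that metric, not the authors' evaluation code; the function name `rmse` and the logP values are hypothetical.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Root-mean-square error between predicted and observed values."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean of squared residuals, then square root.
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical logP values, invented purely for illustration.
predicted_logp = [1.0, 2.0, 3.0]
observed_logp = [1.0, 2.0, 5.0]
error = rmse(predicted_logp, observed_logp)  # in log units
```

Because the residuals are squared before averaging, a single large miss dominates the score, which is one reason a small external test set (11 molecules here) can give an RMSE quite different from an internal benchmark.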
Affiliation(s)
- Raymond Lui: Pharmacoinformatics Laboratory, Discipline of Pharmacology, School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW, 2006, Australia
- Davy Guan: Pharmacoinformatics Laboratory, Discipline of Pharmacology, School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW, 2006, Australia
- Slade Matthews: Pharmacoinformatics Laboratory, Discipline of Pharmacology, School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW, 2006, Australia
|
49
|
Shin HK. Electron configuration-based neural network model to predict physicochemical properties of inorganic compounds. RSC Adv 2020; 10:33268-33278. [PMID: 35515036 PMCID: PMC9056678 DOI: 10.1039/d0ra05873d] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/06/2020] [Accepted: 09/01/2020] [Indexed: 11/21/2022] Open
Abstract
Registration, evaluation, and authorization of chemicals (REACH), the regulation of chemicals in use, requires the characterization and reporting of the physicochemical properties of compounds. To cope with the financial burden of the experiments, the use of computational models is permitted for prediction of properties. Although a number of physicochemical property prediction models have been developed, their applicability domain is limited to organic molecules, since most available data concern organic molecules and most molecular descriptors are restricted to organic molecule calculations. Prediction models developed for inorganic compounds were intended to predict endpoints relevant to novel material design. Therefore, no models were available for predicting endpoints of inorganic compounds that are significant from a regulatory perspective. In this study, boiling point, water solubility, melting point, and pyrolysis point prediction models were developed for inorganic compounds based on their composition. The electron configuration of each element in the molecule was used as a descriptor. The dataset covered a wide range of endpoints and diverse elements. The performance of the models was measured using R², mean absolute error, and Spearman's correlation coefficient, and indicated good prediction accuracy for continuous endpoints and good prioritization of inorganic compounds.
Affiliation(s)
- Hyun Kil Shin: Toxicoinformatics Group, Department of Predictive Toxicology, Korea Institute of Toxicology, Daejeon, Republic of Korea
|
50
|
Montanari F, Kuhnke L, Ter Laak A, Clevert DA. Modeling Physico-Chemical ADMET Endpoints with Multitask Graph Convolutional Networks. Molecules 2019; 25:E44. [PMID: 31877719 DOI: 10.3390/molecules25010044] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 12/01/2019] [Revised: 12/19/2019] [Accepted: 12/20/2019] [Indexed: 11/19/2022] Open
Abstract
Simple physico-chemical properties, like logD, solubility, or melting point, can reveal a great deal about how a compound under development might later behave. These data are typically measured for most compounds in drug discovery projects in a medium throughput fashion. Collecting and assembling all the Bayer in-house data related to these properties allowed us to apply powerful machine learning techniques to predict the outcome of those assays for new compounds. In this paper, we report our finding that, especially for predicting physicochemical ADMET endpoints, a multitask graph convolutional approach appears a highly competitive choice. For seven endpoints of interest, we compared the performance of that approach to fully connected neural networks and different single task models. The new model shows increased predictive performance compared to previous modeling methods and will allow early prioritization of compounds even before they are synthesized. In addition, our model follows the generalized solubility equation without being explicitly trained under this constraint.
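For context, the generalized solubility equation mentioned above is commonly written (in the form attributed to Yalkowsky and co-workers) as log S = 0.5 − 0.01 (MP − 25) − log P, with S the molar aqueous solubility, MP the melting point in °C, and log P the octanol-water partition coefficient; the melting-point term is taken as zero for compounds that are liquid at 25 °C. The sketch below only illustrates that equation and is not the authors' model; the function name and input values are hypothetical.

```python
def gse_log_solubility(log_p: float, melting_point_c: float) -> float:
    """Estimate log10 molar aqueous solubility with the generalized
    solubility equation: log S = 0.5 - 0.01 * (MP - 25) - log P."""
    # The melting-point term is zero for liquids (MP at or below 25 degC).
    mp_term = 0.01 * max(melting_point_c - 25.0, 0.0)
    return 0.5 - mp_term - log_p


# Hypothetical compound: log P = 2.0, melting point = 125 degC.
log_s = gse_log_solubility(log_p=2.0, melting_point_c=125.0)
```

The equation ties three of the endpoints named in the abstract together, which is why a multitask model that predicts them jointly can be checked against it after training.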
|