Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Morris P, St. Clair R, Hahn WE, Barenholtz E. Predicting Binding from Screening Assays with Transformer Network Embeddings. J Chem Inf Model 2020;60:4191-4199. [DOI: 10.1021/acs.jcim.9b01212] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 01/01/2023]

Cited by Other Article(s)

Vittorio S, Lunghini F, Morerio P, Gadioli D, Orlandini S, Silva P, Martinovic J, Pedretti A, Bonanni D, Del Bue A, Palermo G, Vistoli G, Beccari AR. Addressing docking pose selection with structure-based deep learning: Recent advances, challenges and opportunities. Comput Struct Biotechnol J 2024;23:2141-2151. [PMID: 38827235 PMCID: PMC11141151 DOI: 10.1016/j.csbj.2024.05.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 05/15/2024] [Accepted: 05/15/2024] [Indexed: 06/04/2024] Open

Luong KD, Singh A. Application of Transformers in Cheminformatics. J Chem Inf Model 2024;64:4392-4409. [PMID: 38815246 PMCID: PMC11167597 DOI: 10.1021/acs.jcim.3c02070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2023] [Revised: 04/05/2024] [Accepted: 05/06/2024] [Indexed: 06/01/2024]

Choi S, Lee J, Seo J, Han SW, Lee SH, Seo JH, Seok J. Automated BigSMILES conversion workflow and dataset for homopolymeric macromolecules. Sci Data 2024;11:371. [PMID: 38605036 PMCID: PMC11009387 DOI: 10.1038/s41597-024-03212-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Accepted: 04/02/2024] [Indexed: 04/13/2024] Open

Daza D, Alivanistos D, Mitra P, Pijnenburg T, Cochez M, Groth P. BioBLP: a modular framework for learning on multimodal biomedical knowledge graphs. J Biomed Semantics 2023;14:20. [PMID: 38066573 PMCID: PMC10709903 DOI: 10.1186/s13326-023-00301-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Accepted: 11/29/2023] [Indexed: 12/18/2023] Open

Abstract

BACKGROUND

Knowledge graphs (KGs) are an important tool for representing complex relationships between entities in the biomedical domain. Several methods have been proposed for learning embeddings that can be used to predict new links in such graphs. Some methods ignore valuable attribute data associated with entities in biomedical KGs, such as protein sequences, or molecular graphs. Other works incorporate such data, but assume that entities can be represented with the same data modality. This is not always the case for biomedical KGs, where entities exhibit heterogeneous modalities that are central to their representation in the subject domain.

OBJECTIVE

We aim to understand how to incorporate multimodal data into biomedical KG embeddings, and analyze the resulting performance in comparison with traditional methods. We propose a modular framework for learning embeddings in KGs with entity attributes, that allows encoding attribute data of different modalities while also supporting entities with missing attributes. We additionally propose an efficient pretraining strategy for reducing the required training runtime. We train models using a biomedical KG containing approximately 2 million triples, and evaluate the performance of the resulting entity embeddings on the tasks of link prediction, and drug-protein interaction prediction, comparing against methods that do not take attribute data into account.

RESULTS

In the standard link prediction evaluation, the proposed method results in competitive, yet lower performance than baselines that do not use attribute data. When evaluated in the task of drug-protein interaction prediction, the method compares favorably with the baselines. Further analyses show that incorporating attribute data does outperform baselines over entities below a certain node degree, comprising approximately 75% of the diseases in the graph. We also observe that optimizing attribute encoders is a challenging task that increases optimization costs. Our proposed pretraining strategy yields significantly higher performance while reducing the required training runtime.

CONCLUSION

BioBLP makes it possible to investigate different ways of incorporating multimodal biomedical data for learning representations in KGs. With a particular implementation, we find that incorporating attribute data does not consistently outperform baselines, but improvements are obtained on a comparatively large subset of entities below a specific node degree. Our results indicate a potential for improved performance in scientific discovery tasks where understudied areas of the KG would benefit from link prediction methods.
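The node-degree analysis mentioned in this abstract can be illustrated with a minimal sketch. It assumes the KG is available as a list of (head, relation, tail) triples and counts each triple once for its head and once for its tail; the toy triples, the function names, and the threshold value are hypothetical and are not taken from the BioBLP paper or its code.

```python
from collections import Counter


def node_degrees(triples):
    """Count how many triples each entity takes part in (as head or tail)."""
    degrees = Counter()
    for head, _relation, tail in triples:
        degrees[head] += 1
        degrees[tail] += 1
    return degrees


def low_degree_entities(triples, max_degree):
    """Return the entities whose degree is strictly below max_degree."""
    degrees = node_degrees(triples)
    return {entity for entity, degree in degrees.items() if degree < max_degree}


# Hypothetical toy graph, for illustration only.
toy_triples = [
    ("drug_a", "treats", "disease_x"),
    ("drug_a", "targets", "protein_p"),
    ("drug_b", "targets", "protein_p"),
    ("protein_p", "associated_with", "disease_x"),
    ("drug_b", "treats", "disease_y"),
]

sparse_entities = low_degree_entities(toy_triples, max_degree=2)
print(sorted(sparse_entities))
```

In an evaluation along these lines, link prediction metrics would be reported separately for test triples that involve entities in the low-degree set, which is where the abstract reports gains from attribute data.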

Bao H, Zhao J, Zhao X, Zhao C, Lu X, Xu G. Prediction of plant secondary metabolic pathways using deep transfer learning. BMC Bioinformatics 2023;24:348. [PMID: 37726702 PMCID: PMC10507959 DOI: 10.1186/s12859-023-05485-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Accepted: 09/14/2023] [Indexed: 09/21/2023] Open

Abstract

BACKGROUND

Plant secondary metabolites are highly valued for their applications in pharmaceuticals, nutrition, flavors, and aesthetics. It is of great importance to elucidate plant secondary metabolic pathways due to their crucial roles in biological processes during plant growth and development. However, understanding plant biosynthesis and degradation pathways remains a challenge due to the lack of sufficient information in current databases. To address this issue, we proposed a transfer learning approach using a pre-trained hybrid deep learning architecture that combines Graph Transformer and convolutional neural network (GTC) to predict plant metabolic pathways.

RESULTS

GTC provides comprehensive molecular representation by extracting both structural features from the molecular graph and textual information from the SMILES string. GTC is pre-trained on the KEGG datasets to acquire general features, followed by fine-tuning on plant-derived datasets. Four metrics were chosen for model performance evaluation. The results show that GTC outperforms six other models, including three previously reported machine learning models, on the KEGG dataset. GTC yields an accuracy of 96.75%, precision of 85.14%, recall of 83.03%, and F1 score of 84.06%. Furthermore, an ablation study confirms the indispensability of all the components of the hybrid GTC model. Transfer learning is then employed to leverage the shared knowledge acquired from the KEGG metabolic pathways. As a result, the transferred GTC exhibits outstanding accuracy in predicting plant secondary metabolic pathways, with an average accuracy of 98.30% in fivefold cross-validation and 97.82% on the final test. In addition, GTC is employed to classify natural products. It achieves a perfect accuracy score of 100.00% for alkaloids, while its lowest accuracy score, 98.42%, is for shikimates and phenylpropanoids.

CONCLUSIONS

The proposed GTC effectively captures molecular features, and achieves high performance in classifying KEGG metabolic pathways and predicting plant secondary metabolic pathways via transfer learning. Furthermore, GTC demonstrates its generalization ability by accurately classifying natural products. A user-friendly executable program has been developed, which only requires the input of the SMILES string of the query compound in a graphical interface.
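As a quick sanity check on the metrics reported in this abstract, the sketch below computes the harmonic mean of the reported precision and recall. The function name is hypothetical, and the calculation assumes a single pooled precision and recall; the abstract does not say how the authors averaged across classes, so a small gap from the reported F1 score would not be surprising.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units in, same units out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract, in percent.
reported_precision = 85.14
reported_recall = 83.03

f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(round(f1, 2))  # 84.07
```

The result is within about 0.01 percentage points of the 84.06% stated in the abstract, so the reported figures are mutually consistent under this assumption.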

Affiliation(s)

Han Bao: CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, 116023, People's Republic of China; University of Chinese Academy of Sciences, Beijing, 100049, People's Republic of China; Liaoning Province Key Laboratory of Metabolomics, Dalian, 116023, People's Republic of China
Jinhui Zhao: CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, 116023, People's Republic of China; University of Chinese Academy of Sciences, Beijing, 100049, People's Republic of China; Liaoning Province Key Laboratory of Metabolomics, Dalian, 116023, People's Republic of China
Xinjie Zhao: CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, 116023, People's Republic of China; University of Chinese Academy of Sciences, Beijing, 100049, People's Republic of China; Liaoning Province Key Laboratory of Metabolomics, Dalian, 116023, People's Republic of China
Chunxia Zhao: CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, 116023, People's Republic of China; University of Chinese Academy of Sciences, Beijing, 100049, People's Republic of China; Liaoning Province Key Laboratory of Metabolomics, Dalian, 116023, People's Republic of China
Xin Lu: CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, 116023, People's Republic of China; University of Chinese Academy of Sciences, Beijing, 100049, People's Republic of China; Liaoning Province Key Laboratory of Metabolomics, Dalian, 116023, People's Republic of China
Guowang Xu: CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, 116023, People's Republic of China; University of Chinese Academy of Sciences, Beijing, 100049, People's Republic of China; Liaoning Province Key Laboratory of Metabolomics, Dalian, 116023, People's Republic of China

Tran T, Ekenna C. Molecular Descriptors Property Prediction Using Transformer-Based Approach. Int J Mol Sci 2023;24:11948. [PMID: 37569322 PMCID: PMC10419034 DOI: 10.3390/ijms241511948] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Revised: 07/21/2023] [Accepted: 07/24/2023] [Indexed: 08/13/2023] Open

Ramírez-Palacios C, Marrink SJ. Super High-Throughput Screening of Enzyme Variants by Spectral Graph Convolutional Neural Networks. J Chem Theory Comput 2023. [PMID: 36961994 PMCID: PMC10373491 DOI: 10.1021/acs.jctc.2c01227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/26/2023]

Baptista D, Ferreira PG, Rocha M. A systematic evaluation of deep learning methods for the prediction of drug synergy in cancer. PLoS Comput Biol 2023;19:e1010200. [PMID: 36952569 PMCID: PMC10072473 DOI: 10.1371/journal.pcbi.1010200] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/13/2022] [Revised: 04/04/2023] [Accepted: 02/08/2023] [Indexed: 03/25/2023] Open

Abstract

One of the main obstacles to the successful treatment of cancer is the phenomenon of drug resistance. A common strategy to overcome resistance is the use of combination therapies. However, the space of possibilities is huge and efficient search strategies are required. Machine Learning (ML) can be a useful tool for the discovery of novel, clinically relevant anti-cancer drug combinations. In particular, deep learning (DL) has become a popular choice for modeling drug combination effects. Here, we set out to examine the impact of different methodological choices on the performance of multimodal DL-based drug synergy prediction methods, including the use of different input data types, preprocessing steps and model architectures. Focusing on the NCI ALMANAC dataset, we found that feature selection based on prior biological knowledge has a positive impact: limiting gene expression data to cancer or drug response-specific genes improved performance. Drug features appeared to be more predictive of drug response, with a 41% increase in coefficient of determination (R²) and 26% increase in Spearman correlation relative to a baseline model that used only cell line and drug identifiers. Molecular fingerprint-based drug representations performed slightly better than learned representations: ECFP4 fingerprints increased R² by 5.3% and Spearman correlation by 2.8% relative to the best learned representations. In general, fully connected feature-encoding subnetworks outperformed other architectures. DL outperformed other ML methods by more than 35% (R²) and 14% (Spearman). Additionally, an ensemble combining the top DL and ML models improved performance by about 6.5% (R²) and 4% (Spearman). Using a state-of-the-art interpretability method, we showed that DL models can learn to associate drug and cell line features with drug response in a biologically meaningful way.
The strategies explored in this study will help to improve the development of computational methods for the rational design of effective drug combinations for cancer therapy.
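The two evaluation metrics named in this abstract, the coefficient of determination and the Spearman correlation, can be sketched in plain Python as follows. This is a minimal illustration under standard textbook definitions (Spearman computed as the Pearson correlation of average ranks); the function names and the toy values are hypothetical and do not come from the paper.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical observed and predicted responses, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

print(r_squared(observed, predicted))
print(spearman(observed, predicted))
```

R² penalizes the size of each prediction error, while Spearman correlation only checks whether predictions are ordered correctly, which is why the abstract reports both.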

Accurate predictions of drugs aqueous solubility via deep learning tools. J Mol Struct 2022. [DOI: 10.1016/j.molstruc.2021.131562] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]

Deep Learning-Assisted Repurposing of Plant Compounds for Treating Vascular Calcification: An In Silico Study with Experimental Validation. Oxid Med Cell Longev 2022;2022:4378413. [PMID: 35035662 PMCID: PMC8754599 DOI: 10.1155/2022/4378413] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/06/2021] [Revised: 10/24/2021] [Accepted: 11/13/2021] [Indexed: 12/13/2022]

Abstract

Background

Vascular calcification (VC) constitutes subclinical vascular burden and increases cardiovascular mortality. Effective therapeutics for VC remain to be identified. We aimed to use a deep learning-based strategy to screen and uncover plant compounds that could potentially be repurposed for managing VC.

Methods

We integrated drugome, interactome, and diseasome information from the Comparative Toxicogenomics Database (CTD), DrugBank, PubChem, Gene Ontology (GO), and BioGrid to analyze drug-disease associations. Deep representation learning was performed using a high-level description of the local network architecture and features of the entities, followed by learning global node embeddings from a heterogeneous network using a graph neural network architecture; a random forest classifier was then established for prediction. Predicted results were tested for validity in an in vitro VC model based on the probability scores.

Results

We collected 6,790 compounds with available Simplified Molecular-Input Line-Entry System (SMILES) data, 11,958 GO terms, 7,238 diseases, and 25,482 proteins. Local embedding vectors were then derived using an end-to-end transformer network and the node2vec algorithm, and global embedding vectors were learned from the heterogeneous network via the graph neural network. Our algorithm distinguished well between potential compounds, assigning higher prediction scores to compound categories with higher potential and lower scores to other categories. Probability score-dependent selection revealed that antioxidants such as sulforaphane and daidzein were potentially effective compounds against VC, while catechin had low probability. All three compounds were validated in vitro.

Conclusions

Our findings exemplify the utility of deep learning in identifying promising VC-treating plant compounds. Our model can be a quick and comprehensive computational screening tool to assist in the early drug discovery process.

Geometric deep learning on molecular representations. Nat Mach Intell 2021. [DOI: 10.1038/s42256-021-00418-8] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 11/09/2022]

Walters WP, Wang R. New Trends in Virtual Screening. J Chem Inf Model 2020;60:4109-4111. [PMID: 32981325 DOI: 10.1021/acs.jcim.0c01009] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 01/03/2023]

Gao P, Zhang J, Sun Y, Yu J. Accurate predictions of aqueous solubility of drug molecules via the multilevel graph convolutional network (MGCN) and SchNet architectures. Phys Chem Chem Phys 2020;22:23766-23772. [PMID: 33063077 DOI: 10.1039/d0cp03596c] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/25/2022]