1. Wang Y, Wang B, Zou J, Wu A, Liu Y, Wan Y, Luo J, Wu J. Capsule neural network and its applications in drug discovery. iScience 2025; 28:112217. [PMID: 40241764 PMCID: PMC12002614 DOI: 10.1016/j.isci.2025.112217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/18/2025]
Abstract
Deep learning holds great promise in drug discovery, yet its application is hindered by high labeling costs and limited datasets. Developing algorithms that effectively learn from sparsely labeled data is crucial. Capsule networks (CapsNet), introduced in 2017, solve the spatial information loss in traditional neural networks and excel in handling small datasets by capturing spatial hierarchical relationships among features. This capability makes CapsNet particularly promising for drug discovery, where data scarcity is a common challenge. Various modified CapsNet architectures have been successfully applied to drug design and discovery tasks. This review provides a comprehensive analysis of CapsNet's theoretical foundations, its current applications in drug discovery, and its performance in addressing key challenges in the field. Additionally, the study highlights the limitations of CapsNet and outlines potential future research directions to further enhance its utility in drug discovery, offering valuable insights for researchers in both computational and pharmaceutical sciences.
Affiliation(s)
- Yiwei Wang
  - School of Basic Medical Sciences, Southwest Medical University, Luzhou 646000, China
  - Key Laboratory of Medical Electrophysiology, Ministry of Education & Medical Electrophysiological Key Laboratory of Sichuan Province, Institute of Cardiovascular Research, Southwest Medical University, Luzhou 646000, China
- Binyou Wang
  - School of Basic Medical Sciences, Southwest Medical University, Luzhou 646000, China
- Jun Zou
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu 610041, China
- Anguo Wu
  - Sichuan Key Medical Laboratory of New Drug Discovery and Druggability Evaluation, Luzhou Key Laboratory of Activity Screening and Druggability Evaluation for Chinese Materia Medica, School of Pharmacy, Southwest Medical University, Luzhou 646000, China
- Yuan Liu
  - School of Basic Medical Sciences, Southwest Medical University, Luzhou 646000, China
- Ying Wan
  - School of Basic Medical Sciences, Southwest Medical University, Luzhou 646000, China
- Jiesi Luo
  - School of Basic Medical Sciences, Southwest Medical University, Luzhou 646000, China
- Jianming Wu
  - School of Basic Medical Sciences, Southwest Medical University, Luzhou 646000, China
  - Key Laboratory of Medical Electrophysiology, Ministry of Education & Medical Electrophysiological Key Laboratory of Sichuan Province, Institute of Cardiovascular Research, Southwest Medical University, Luzhou 646000, China
  - Sichuan Key Medical Laboratory of New Drug Discovery and Druggability Evaluation, Luzhou Key Laboratory of Activity Screening and Druggability Evaluation for Chinese Materia Medica, School of Pharmacy, Southwest Medical University, Luzhou 646000, China
2. He J, Li F, Li J, Hu X, Nian Y, Xiang Y, Wang J, Wei Q, Li Y, Xu H, Tao C. Prompt Tuning in Biomedical Relation Extraction. Journal of Healthcare Informatics Research 2024; 8:206-224. [PMID: 38681754 PMCID: PMC11052745 DOI: 10.1007/s41666-024-00162-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2022] [Revised: 02/09/2024] [Accepted: 02/19/2024] [Indexed: 05/01/2024]
Abstract
Biomedical relation extraction (RE) is critical in constructing high-quality knowledge graphs and databases as well as supporting many downstream text mining applications. This paper explores prompt tuning on biomedical RE and its few-shot scenarios, aiming to propose a simple yet effective model for this specific task. Prompt tuning reformulates natural language processing (NLP) downstream tasks into masked language problems by embedding specific text prompts into the original input, facilitating the adaption of pre-trained language models (PLMs) to better address these tasks. This study presents a customized prompt tuning model designed explicitly for biomedical RE, including its applicability in few-shot learning contexts. The model's performance was rigorously assessed using the chemical-protein relation (CHEMPROT) dataset from BioCreative VI and the drug-drug interaction (DDI) dataset from SemEval-2013, showcasing its superior performance over conventional fine-tuned PLMs across both datasets, encompassing few-shot scenarios. This observation underscores the effectiveness of prompt tuning in enhancing the capabilities of conventional PLMs, though the extent of enhancement may vary by specific model. Additionally, the model demonstrated a harmonious balance between simplicity and efficiency, matching state-of-the-art performance without needing external knowledge or extra computational resources. The pivotal contribution of our study is the development of a suitably designed prompt tuning model, highlighting prompt tuning's effectiveness in biomedical RE. It offers a robust, efficient approach to the field's challenges and represents a significant advancement in extracting complex relations from biomedical texts. Supplementary Information The online version contains supplementary material available at 10.1007/s41666-024-00162-9.
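The reformulation described in this abstract, turning relation extraction into a masked-language problem by embedding a text prompt in the input, can be pictured with a small sketch. The template wording, the label words, and the names `build_prompt`, `decode`, and `VERBALIZER` are assumptions made for illustration; they are not the prompts or code used in the cited paper, and the scores stand in for what a pre-trained language model would return at the mask position.

```python
# Illustrative sketch only: the template wording and label words below are
# assumptions, not the prompts used in the cited paper.
MASK = "[MASK]"

# Hypothetical verbalizer: maps a word predicted at the mask to a relation label.
VERBALIZER = {
    "inhibits": "INHIBITOR",
    "activates": "ACTIVATOR",
    "unrelated": "NO_RELATION",
}


def build_prompt(sentence: str, chemical: str, protein: str) -> str:
    """Append a cloze-style template so relation extraction becomes mask filling."""
    return f"{sentence} In this sentence, {chemical} {MASK} {protein}."


def decode(mask_scores: dict) -> str:
    """Pick the relation whose label word scores highest at the mask position."""
    candidates = {w: s for w, s in mask_scores.items() if w in VERBALIZER}
    best_word = max(candidates, key=candidates.get)
    return VERBALIZER[best_word]


prompt = build_prompt("Aspirin blocks COX-1 activity.", "Aspirin", "COX-1")
# Scores a masked language model might assign at the mask; made up for this example.
relation = decode({"inhibits": 0.81, "activates": 0.07, "unrelated": 0.12})
```

In a real setup the made-up score dictionary would be replaced by the model's output distribution over the vocabulary at the mask token, restricted to the label words.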
Affiliation(s)
- Jianping He
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX USA
- Fang Li
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX USA
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Jacksonville, FL USA
- Jianfu Li
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX USA
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Jacksonville, FL USA
- Xinyue Hu
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX USA
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Jacksonville, FL USA
- Yi Nian
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX USA
- Yang Xiang
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX USA
- Jingqi Wang
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX USA
- Qiang Wei
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX USA
- Yiming Li
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX USA
- Hua Xu
  - Department of Bioinformatics and Data Science, Yale School of Medicine, New Haven, CT USA
- Cui Tao
  - McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX USA
  - Department of Artificial Intelligence and Informatics, Mayo Clinic, Jacksonville, FL USA
3. Guo S, Yang W, Han L, Song X, Wang G. A multi-layer soft lattice based model for Chinese clinical named entity recognition. BMC Med Inform Decis Mak 2022; 22:201. [PMID: 35908055 PMCID: PMC9338545 DOI: 10.1186/s12911-022-01924-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/23/2022] [Accepted: 07/04/2022] [Indexed: 11/13/2022]
Abstract
Objective: Named entity recognition (NER) is a key and fundamental part of many medical and clinical tasks, including the establishment of a medical knowledge graph, decision-making support, and question answering systems. When extracting entities from electronic health records (EHRs), NER models mostly apply long short-term memory (LSTM) and perform surprisingly well in clinical NER. However, these LSTM-based models often need deeper networks to capture long-distance dependencies. As a result, the LSTM-based models that achieve high accuracy generally require long training times and extensive training data, which has hindered their adoption in clinical scenarios with limited training time. Method: Inspired by Transformer, we combine Transformer with Soft Term Position Lattice to form a soft lattice structure Transformer, which models long-distance dependencies similarly to LSTM. Our model consists of four components: the WordPiece module, the BERT module, the soft lattice structure Transformer module, and the CRF module. Result: Our experiments demonstrated that this approach increased the F1 by 1–5% in the CCKS NER task compared to other models based on LSTM with CRF and consumed less training time. Additional evaluations showed that the lattice structure Transformer performs well in recognizing long medical terms, abbreviations, and numbers. The proposed model achieves a 91.6% F-measure in recognizing long medical terms and a 90.36% F-measure in recognizing abbreviations and numbers. Conclusions: By using the soft lattice structure Transformer, the method proposed in this paper captured Chinese words to lattice information, making our model suitable for Chinese clinical medical records. Transformers with multilayer soft lattice Chinese word construction can capture potential interactions between Chinese characters and words.
Affiliation(s)
- Shuli Guo
  - State Key Laboratory of Intelligent Control and Decision of Complex Systems, School of Automation, Beijing Institute of Technology, Beijing, China
- Wentao Yang
  - State Key Laboratory of Intelligent Control and Decision of Complex Systems, School of Automation, Beijing Institute of Technology, Beijing, China
- Lina Han
  - Department of Cardiology, The Second Medical Center, National Clinical Research Center for Geriatric Diseases, Chinese PLA General Hospital, Beijing, China.
- Xiaowei Song
  - State Key Laboratory of Intelligent Control and Decision of Complex Systems, School of Automation, Beijing Institute of Technology, Beijing, China
- Guowei Wang
  - State Key Laboratory of Intelligent Control and Decision of Complex Systems, School of Automation, Beijing Institute of Technology, Beijing, China
4. Manifold biomedical text sentence embedding. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.04.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
5. Zhao D, Wang J, Zhang Y, Wang X, Lin H, Yang Z. Incorporating representation learning and multihead attention to improve biomedical cross-sentence n-ary relation extraction. BMC Bioinformatics 2020; 21:312. [PMID: 32677883 PMCID: PMC7364499 DOI: 10.1186/s12859-020-03629-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 11/27/2019] [Accepted: 06/23/2020] [Indexed: 11/10/2022]
Abstract
BACKGROUND Most biomedical information extraction focuses on binary relations within single sentences. However, extracting n-ary relations that span multiple sentences is in huge demand. At present, in the cross-sentence n-ary relation extraction task, the mainstream method not only relies heavily on syntactic parsing but also ignores prior knowledge. RESULTS In this paper, we propose a novel cross-sentence n-ary relation extraction method that utilizes the multihead attention and knowledge representation that is learned from the knowledge graph. Our model is built on self-attention, which can directly capture the relations between two words regardless of their syntactic relation. In addition, our method makes use of entity and relation information from the knowledge base to impose assistance while predicting the relation. Experiments on n-ary relation extraction show that combining context and knowledge representations can significantly improve the n-ary relation extraction performance. Meanwhile, we achieve comparable results with state-of-the-art methods. CONCLUSIONS We explored a novel method for cross-sentence n-ary relation extraction. Unlike previous approaches, our methods operate directly on the sequence and learn how to model the internal structures of sentences. In addition, we introduce the knowledge representations learned from the knowledge graph into the cross-sentence n-ary relation extraction. Experiments based on knowledge representation learning show that entities and relations can be extracted in the knowledge graph, and coding this knowledge can provide consistent benefits.
Affiliation(s)
- Di Zhao
  - School of Computer Science and Technology, Dalian University of Technology, Dalian, China
- Jian Wang
  - School of Computer Science and Technology, Dalian University of Technology, Dalian, China.
- Yijia Zhang
  - School of Computer Science and Technology, Dalian University of Technology, Dalian, China
- Xin Wang
  - School of Computer Science and Technology, Dalian University of Technology, Dalian, China
- Hongfei Lin
  - School of Computer Science and Technology, Dalian University of Technology, Dalian, China
- Zhihao Yang
  - School of Computer Science and Technology, Dalian University of Technology, Dalian, China
6. Sun C, Yang Z, Su L, Wang L, Zhang Y, Lin H, Wang J. Chemical–protein interaction extraction via Gaussian probability distribution and external biomedical knowledge. Bioinformatics 2020; 36:4323-4330. [DOI: 10.1093/bioinformatics/btaa491] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 11/21/2019] [Revised: 04/22/2020] [Accepted: 05/06/2020] [Indexed: 01/25/2023]
Abstract
Motivation
The biomedical literature contains a wealth of chemical–protein interactions (CPIs). Automatically extracting CPIs described in biomedical literature is essential for drug discovery, precision medicine, as well as basic biomedical research. Most existing methods focus only on the sentence sequence to identify these CPIs. However, the local structure of sentences and external biomedical knowledge also contain valuable information. Effective use of such information may improve the performance of CPI extraction.
Results
In this article, we propose a novel neural network-based approach to improve CPI extraction. Specifically, the approach first employs BERT to generate high-quality contextual representations of the title sequence, instance sequence and knowledge sequence. Then, the Gaussian probability distribution is introduced to capture the local structure of the instance. Meanwhile, the attention mechanism is applied to fuse the title information and biomedical knowledge, respectively. Finally, the related representations are concatenated and fed into the softmax function to extract CPIs. We evaluate our proposed model on the CHEMPROT corpus. Our proposed model is superior in performance as compared with other state-of-the-art models. The experimental results show that the Gaussian probability distribution and external knowledge are complementary to each other. Integrating them can effectively improve the CPI extraction performance. Furthermore, the Gaussian probability distribution can effectively improve the extraction performance of sentences with overlapping relations in biomedical relation extraction tasks.
Availability and implementation
Data and code are available at https://github.com/CongSun-dlut/CPI_extraction.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Cong Sun
  - School of Computer Science and Technology
- Leilei Su
  - School of Mathematical Sciences, Dalian University of Technology, Dalian 116024, China
- Lei Wang
  - Beijing Institute of Health Administration and Medical Information, Beijing 100850, China
- Yin Zhang
  - Beijing Institute of Health Administration and Medical Information, Beijing 100850, China
- Jian Wang
  - School of Computer Science and Technology
7. Li F, Zhu F, Ling X, Liu Q. Protein Interaction Network Reconstruction Through Ensemble Deep Learning With Attention Mechanism. Front Bioeng Biotechnol 2020; 8:390. [PMID: 32432096 PMCID: PMC7215070 DOI: 10.3389/fbioe.2020.00390] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 01/21/2020] [Accepted: 04/07/2020] [Indexed: 11/13/2022]
Abstract
Protein interactions play an essential role in studying living systems and life phenomena. A considerable amount of literature has been published on analyzing and predicting protein interactions, using approaches such as support vector machines, homology-based methods and similarity-based methods, each with its pros and cons. Most existing methods for predicting protein interactions require prior domain knowledge, making it difficult to effectively extract protein features. A single method is unsatisfactory in predicting protein interactions, which indicates the need for a comprehensive method that combines the advantages of various methods. On this basis, a deep ensemble learning method called EnAmDNN (Ensemble Deep Neural Networks with Attention Mechanism) is proposed to predict protein interactions; it is an appropriate candidate for comprehensive learning because it combines multiple models and draws on the advantages of various methods. In particular, it encodes protein sequences with the local descriptor, auto covariance, conjoint triad and pseudo amino acid composition, and combines the vector representation of each protein in the protein interaction network. It then uses multi-layer convolutional neural networks to automatically extract protein features and constructs an attention mechanism to analyze deep-seated relationships between proteins. We set up four different structures of deep learning models. In the ensemble learning model, second-layer data sets are generated with five-fold cross validation from the basic learners, and the protein interaction network is then predicted by combining 16 models. Results on five independent PPI data sets demonstrate that EnAmDNN achieves better prediction performance than the methods it was compared with.
Affiliation(s)
- Feifei Li
  - School of Computer Science and Technology, Soochow University, Suzhou, China
- Fei Zhu
  - School of Computer Science and Technology, Soochow University, Suzhou, China
  - Provincial Key Laboratory for Computer Information Processing Technology, Soochow University, Suzhou, China
- Xinghong Ling
  - School of Computer Science and Technology, Soochow University, Suzhou, China
- Quan Liu
  - School of Computer Science and Technology, Soochow University, Suzhou, China
8. Sun C, Yang Z, Wang L, Zhang Y, Lin H, Wang J. Attention guided capsule networks for chemical-protein interaction extraction. J Biomed Inform 2020; 103:103392. [PMID: 32068034 DOI: 10.1016/j.jbi.2020.103392] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 10/07/2019] [Revised: 02/08/2020] [Accepted: 02/11/2020] [Indexed: 11/19/2022]
Abstract
The biomedical literature contains a sufficient number of chemical-protein interactions (CPIs). Automatic extraction of CPI is a crucial task in the biomedical domain, which has excellent benefits for precision medicine, drug discovery and basic biomedical research. In this study, we propose a novel model, BERT-based attention-guided capsule networks (BERT-Att-Capsule), for CPI extraction. Specifically, the approach first employs BERT (Bidirectional Encoder Representations from Transformers) to capture the long-range dependencies and bidirectional contextual information of input tokens. Then, the aggregation is regarded as a routing problem for how to pass messages from source capsule nodes to target capsule nodes. This process enables capsule networks to determine what and how much information need to be transferred, as well as to identify sophisticated and interleaved features. Afterwards, the multi-head attention is applied to guide the model to learn different contribution weights of capsule networks obtained by the dynamic routing. We evaluate our model on the CHEMPROT corpus. Our approach is superior in performance as compared with other state-of-the-art methods. Experimental results show that our approach can adequately capture the long-range dependencies and bidirectional contextual information of input tokens, obtain more fine-grained aggregation information through attention-guided capsule networks, and therefore improve the performance.
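The routing step mentioned in this abstract, passing messages from source capsules to target capsules and deciding how much each source contributes, can be sketched with the generic routing-by-agreement procedure from the original capsule network work. This is a minimal sketch under that assumption: it is not the BERT-Att-Capsule implementation, it omits the attention guidance entirely, and the names `squash`, `softmax`, `route`, and the toy `u_hat` values are hypothetical.

```python
import math


def squash(s):
    """Shrink vector s so its length lies in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0 for _ in s]
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]


def softmax(row):
    """Numerically stable softmax over one list of logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def route(u_hat, iterations=3):
    """u_hat[i][j] is source capsule i's prediction vector for target capsule j."""
    n_src, n_tgt = len(u_hat), len(u_hat[0])
    dim = len(u_hat[0][0])
    logits = [[0.0] * n_tgt for _ in range(n_src)]
    for _ in range(iterations):
        # Each source distributes its output over the targets.
        coupling = [softmax(row) for row in logits]
        outputs = []
        for j in range(n_tgt):
            s_j = [sum(coupling[i][j] * u_hat[i][j][d] for i in range(n_src))
                   for d in range(dim)]
            outputs.append(squash(s_j))
        # Raise the logit where a prediction agrees with the target's output.
        for i in range(n_src):
            for j in range(n_tgt):
                logits[i][j] += sum(a * b for a, b in zip(u_hat[i][j], outputs[j]))
    return outputs, coupling


# Toy predictions: all three sources agree about target 0 and disagree about target 1.
u_hat = [
    [[1.0, 0.0], [0.5, 0.0]],
    [[1.0, 0.0], [-0.5, 0.0]],
    [[0.9, 0.1], [0.0, 0.5]],
]
outputs, coupling = route(u_hat)
```

Because the sources agree on target 0, their coupling coefficients shift toward it over the iterations, which is the sense in which routing determines "what and how much information" is transferred.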
Affiliation(s)
- Cong Sun
  - School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Zhihao Yang
  - School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China.
- Lei Wang
  - Beijing Institute of Health Administration and Medical Information, Beijing 100850, China.
- Yin Zhang
  - Beijing Institute of Health Administration and Medical Information, Beijing 100850, China
- Hongfei Lin
  - School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
- Jian Wang
  - School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
9. Antunes R, Matos S. Extraction of chemical-protein interactions from the literature using neural networks and narrow instance representation. Database (Oxford) 2019; 2019:baz095. [PMID: 31622463 PMCID: PMC6796919 DOI: 10.1093/database/baz095] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 12/21/2018] [Revised: 06/28/2019] [Accepted: 07/01/2019] [Indexed: 01/21/2023]
Abstract
The scientific literature contains large amounts of information on genes, proteins, chemicals and their interactions. Extraction and integration of this information in curated knowledge bases help researchers support their experimental results, leading to new hypotheses and discoveries. This is especially relevant for precision medicine, which aims to understand the individual variability across patient groups in order to select the most appropriate treatments. Methods for improved retrieval and automatic relation extraction from biomedical literature are therefore required for collecting structured information from the growing number of published works. In this paper, we follow a deep learning approach for extracting mentions of chemical-protein interactions from biomedical articles, based on various enhancements over our participation in the BioCreative VI CHEMPROT task. A significant aspect of our best method is the use of a simple deep learning model together with a very narrow representation of the relation instances, using only up to 10 words from the shortest dependency path and the respective dependency edges. Bidirectional long short-term memory recurrent networks or convolutional neural networks are used to build the deep learning models. We report the results of several experiments and show that our best model is competitive with more complex sentence representations or network structures, achieving an F1-score of 0.6306 on the test set. The source code of our work, along with detailed statistics, is publicly available.
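The narrow instance representation described in this abstract keeps only a few words from the shortest dependency path between the two entities. A minimal sketch of that idea follows, assuming a hand-written toy dependency parse and a simple "keep the first `max_words` tokens" truncation; the cited work also uses the dependency edge labels and does not state its truncation rule here, so `shortest_dependency_path`, `narrow_instance`, and the example edges are illustrative assumptions rather than the authors' code.

```python
from collections import deque


def shortest_dependency_path(edges, start, goal):
    """Breadth-first search over dependency edges treated as undirected."""
    graph = {}
    for head, dependent in edges:
        graph.setdefault(head, []).append(dependent)
        graph.setdefault(dependent, []).append(head)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in graph.get(path[-1], []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return []  # no path between the two tokens


def narrow_instance(edges, chemical, protein, max_words=10):
    """Keep at most max_words tokens from the path between the two entities."""
    return shortest_dependency_path(edges, chemical, protein)[:max_words]


# Hand-written (head, dependent) edges for
# "Aspirin strongly inhibits the COX-1 enzyme"; not produced by a real parser.
edges = [
    ("inhibits", "Aspirin"),
    ("inhibits", "strongly"),
    ("inhibits", "enzyme"),
    ("enzyme", "the"),
    ("enzyme", "COX-1"),
]
path = narrow_instance(edges, "Aspirin", "COX-1")
```

Words off the path, such as "strongly" and "the", are dropped, so the downstream model sees only the tokens that connect the chemical to the protein.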
Affiliation(s)
- Rui Antunes
  - Department of Electronics, Telecommunications and Informatics (DETI), Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro, Aveiro, Portugal
- Sérgio Matos
  - Department of Electronics, Telecommunications and Informatics (DETI), Institute of Electronics and Informatics Engineering of Aveiro (IEETA), University of Aveiro, Aveiro, Portugal