1
Kell G, Roberts A, Umansky S, Qian L, Ferrari D, Soboczenski F, Wallace BC, Patel N, Marshall IJ. Question answering systems for health professionals at the point of care-a systematic review. J Am Med Inform Assoc 2024; 31:1009-1024. [PMID: 38366879 PMCID: PMC10990539 DOI: 10.1093/jamia/ocae015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 01/11/2024] [Accepted: 01/15/2024] [Indexed: 02/18/2024]
Abstract
OBJECTIVES Question answering (QA) systems have the potential to improve the quality of clinical care by providing health professionals with the latest and most relevant evidence. However, QA systems have not been widely adopted. This systematic review aims to characterize current medical QA systems, assess their suitability for healthcare, and identify areas of improvement. MATERIALS AND METHODS We searched PubMed, IEEE Xplore, ACM Digital Library, ACL Anthology, and forward and backward citations on February 7, 2023. We included peer-reviewed journal and conference papers describing the design and evaluation of biomedical QA systems. Two reviewers screened titles, abstracts, and full-text articles. We conducted a narrative synthesis and risk of bias assessment for each study. We assessed the utility of biomedical QA systems. RESULTS We included 79 studies and identified themes, including question realism, answer reliability, answer utility, clinical specialism, systems, usability, and evaluation methods. Clinicians' questions used to train and evaluate QA systems were restricted to certain sources, types and complexity levels. No system communicated confidence levels in the answers or sources. Many studies suffered from high risks of bias and applicability concerns. Only 8 studies completely satisfied any criterion for clinical utility, and only 7 reported user evaluations. Most systems were built with limited input from clinicians. DISCUSSION While machine learning methods have led to increased accuracy, most studies imperfectly reflected real-world healthcare information needs. Key research priorities include developing more realistic healthcare QA datasets and considering the reliability of answer sources, rather than merely focusing on accuracy.
Affiliation(s)
- Gregory Kell
  - Department of Population Health Sciences, King’s College London, London, Greater London, SE1 1UL, United Kingdom
- Angus Roberts
  - Department of Biostatistics and Health Informatics, King’s College London, London, Greater London, SE5 8AB, United Kingdom
- Serge Umansky
  - Metadvice Ltd, London, Greater London, SW1Y 5JG, United Kingdom
- Linglong Qian
  - Department of Biostatistics and Health Informatics, King’s College London, London, Greater London, SE5 8AB, United Kingdom
- Davide Ferrari
  - Department of Population Health Sciences, King’s College London, London, Greater London, SE1 1UL, United Kingdom
- Frank Soboczenski
  - Department of Population Health Sciences, King’s College London, London, Greater London, SE1 1UL, United Kingdom
- Byron C Wallace
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, United States
- Nikhil Patel
  - Department of Population Health Sciences, King’s College London, London, Greater London, SE1 1UL, United Kingdom
- Iain J Marshall
  - Department of Population Health Sciences, King’s College London, London, Greater London, SE1 1UL, United Kingdom
2
Yang H, Li M, Zhou H, Xiao Y, Fang Q, Zhang R. One LLM is not Enough: Harnessing the Power of Ensemble Learning for Medical Question Answering. medRxiv 2023:2023.12.21.23300380. [PMID: 38196648 PMCID: PMC10775333 DOI: 10.1101/2023.12.21.23300380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2024]
Abstract
Objective To enhance the accuracy and reliability of diverse medical question-answering (QA) tasks and to investigate efficient approaches for deploying large language model (LLM) technologies, we developed a novel ensemble learning pipeline that uses state-of-the-art LLMs, focusing on improving performance on diverse medical QA datasets. Materials and Methods Our study employs three medical QA datasets: PubMedQA, MedQA-USMLE, and MedMCQA, each presenting unique challenges in biomedical question answering. The proposed LLM-Synergy framework, focusing exclusively on zero-shot cases using LLMs, incorporates two primary ensemble methods. The first is a Boosting-based weighted majority vote ensemble, in which decision-making is expedited and refined by assigning variable weights to different LLMs through a boosting algorithm. The second is Cluster-based Dynamic Model Selection, which uses a clustering approach to dynamically select the most suitable LLM votes for each query based on the characteristics of the question context. Results The Majority Weighted Vote and Dynamic Model Selection methods demonstrate superior performance compared to individual LLMs across the three medical QA datasets. Specifically, the accuracies are 35.84%, 96.21%, and 37.26% for MedMCQA, PubMedQA, and MedQA-USMLE, respectively, with the Majority Weighted Vote. Correspondingly, the Dynamic Model Selection yields slightly higher accuracies of 38.01%, 96.36%, and 38.13%. Conclusion The LLM-Synergy framework, with its two ensemble methods, represents a significant advancement in leveraging LLMs for medical QA tasks and provides an innovative way of efficiently utilizing developments in LLM technologies, customizable for both existing and potential future challenge tasks in biomedical and health informatics research.
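The weighted majority vote described in this abstract can be illustrated with a minimal sketch. The model names, answer options, and weights below are hypothetical; the abstract states that the weights come from a boosting algorithm, which is not reproduced here, so the weights are simply supplied by hand.

```python
from collections import defaultdict


def weighted_majority_vote(votes, weights):
    """Return the answer option with the largest total weight.

    votes:   dict mapping a model name to the option it chose
    weights: dict mapping a model name to a non-negative weight
             (hand-picked here; an assumption, not learned by boosting)
    """
    totals = defaultdict(float)
    for model, option in votes.items():
        totals[option] += weights.get(model, 0.0)
    # Iterate options in sorted order so ties resolve to the smallest label.
    return max(sorted(totals), key=lambda option: totals[option])


# Hypothetical votes from three models on one multiple-choice question.
votes = {"model_a": "B", "model_b": "C", "model_c": "C"}
weights = {"model_a": 0.7, "model_b": 0.3, "model_c": 0.3}
chosen = weighted_majority_vote(votes, weights)
print(chosen)
```

With these weights the single heavily weighted model outvotes the two lighter ones, which is the behavior that distinguishes a weighted vote from a plain majority vote.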
Affiliation(s)
- Han Yang
  - Institute for Health Informatics, University of Minnesota, Minneapolis, Minnesota, USA
- Mingchen Li
  - Division of Computational Health Sciences, Department of Surgery, University of Minnesota, Minneapolis, MN, USA
- Huixue Zhou
  - Institute for Health Informatics, University of Minnesota, Minneapolis, Minnesota, USA
- Yongkang Xiao
  - Institute for Health Informatics, University of Minnesota, Minneapolis, Minnesota, USA
- Qian Fang
  - H. Milton Stewart School of Industrial & Systems Engineering, Georgia Institute of Technology, Atlanta, GA, USA
- Rui Zhang
  - Division of Computational Health Sciences, Department of Surgery, University of Minnesota, Minneapolis, MN, USA
3
Srinivas K, Gagana Sri R, Pravallika K, Nishitha K, Polamuri SR. COVID-19 prediction based on hybrid Inception V3 with VGG16 using chest X-ray images. Multimedia Tools and Applications 2023:1-18. [PMID: 37362699 PMCID: PMC10240113 DOI: 10.1007/s11042-023-15903-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2022] [Revised: 05/12/2023] [Accepted: 05/18/2023] [Indexed: 06/28/2023]
Abstract
The coronavirus first emerged in the city of Wuhan, China, in December 2019. It belongs to the Coronaviridae family, which can infect both animals and humans. Coronavirus disease 2019 (COVID-19) is typically diagnosed by serology, genetic real-time reverse transcription-polymerase chain reaction (RT-PCR), and antigen testing. These testing methods have limitations such as limited sensitivity, high cost, and long turnaround time. It is therefore necessary to develop an automatic detection system for COVID-19 prediction. Chest X-ray is a lower-cost process in comparison to chest computed tomography (CT). Deep learning is a highly effective machine learning technique for learning from and screening large numbers of COVID-19 and normal chest X-ray images. There are many deep learning methods for prediction, but these methods have limitations such as overfitting, misclassification, and false predictions for poor-quality chest X-rays. To overcome these limitations, a novel hybrid model called "Inception V3 with VGG16 (Visual Geometry Group)" is proposed for the prediction of COVID-19 using chest X-rays. It is a combination of two deep learning models, Inception V3 and VGG16 (IV3-VGG). To build the hybrid model, 243 images were collected from the COVID-19 Radiography Database. Of the 243 X-rays, 121 are COVID-19 positive and 122 are normal. The hybrid model is divided into two modules, namely pre-processing and IV3-VGG. In the pre-processing module, images in the dataset with different sizes and color intensities are identified and pre-processed. The second module, IV3-VGG, consists of four blocks. The first block is used for VGG16, blocks 2 and 3 are used for the Inception V3 networks, and the final block 4 consists of four layers, namely average pooling, dropout, fully connected, and softmax layers. The experimental results show that the IV3-VGG model achieves the highest accuracy, 98%, compared with five prominent existing deep learning models: Inception V3, VGG16, ResNet50, DenseNet121, and MobileNet.
Affiliation(s)
- K. Srinivas
  - Department of CSE, VR Siddhartha Engineering College, Vijayawada, 520007 India
- R. Gagana Sri
  - Department of CSE, VR Siddhartha Engineering College, Vijayawada, 520007 India
- K. Pravallika
  - Department of CSE, Sir C. R. Reddy Engineering College, Eluru, 534007 India
- K. Nishitha
  - Department of CSE, VR Siddhartha Engineering College, Vijayawada, 520007 India
- Subba Rao Polamuri
  - Department of CSE, Bonam Venkata Chalamayya Engineering College (Autonomous), Odalarevu, 533210 India
4
Kuo TT, Pham A, Edelson ME, Kim J, Chan J, Gupta Y, Ohno-Machado L. Blockchain-enabled immutable, distributed, and highly available clinical research activity logging system for federated COVID-19 data analysis from multiple institutions. J Am Med Inform Assoc 2023; 30:1167-1178. [PMID: 36916740 PMCID: PMC10198529 DOI: 10.1093/jamia/ocad049] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/26/2022] [Revised: 03/07/2023] [Accepted: 03/11/2023] [Indexed: 03/15/2023]
Abstract
OBJECTIVE We aimed to develop a distributed, immutable, and highly available cross-cloud blockchain system to facilitate federated data analysis activities among multiple institutions. MATERIALS AND METHODS We preprocessed 9166 COVID-19 Structured Query Language (SQL) code, summary statistics, and user activity logs, from the GitHub repository of the Reliable Response Data Discovery for COVID-19 (R2D2) Consortium. The repository collected local summary statistics from participating institutions and aggregated the global result to a COVID-19-related clinical query, previously posted by clinicians on a website. We developed both on-chain and off-chain components to store/query these activity logs and their associated queries/results on a blockchain for immutability, transparency, and high availability of research communication. We measured run-time efficiency of contract deployment, network transactions, and confirmed the accuracy of recorded logs compared to a centralized baseline solution. RESULTS The smart contract deployment took 4.5 s on an average. The time to record an activity log on blockchain was slightly over 2 s, versus 5-9 s for baseline. For querying, each query took on an average less than 0.4 s on blockchain, versus around 2.1 s for baseline. DISCUSSION The low deployment, recording, and querying times confirm the feasibility of our cross-cloud, blockchain-based federated data analysis system. We have yet to evaluate the system on a larger network with multiple nodes per cloud, to consider how to accommodate a surge in activities, and to investigate methods to lower querying time as the blockchain grows. CONCLUSION Blockchain technology can be used to support federated data analysis among multiple institutions.
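The immutability property that motivates storing activity logs on a blockchain can be illustrated with a single-process hash chain. This is a simplified sketch and not the system described in the abstract: there are no smart contracts, no network, and no consensus, and the field names and log payloads are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, payload):
    """Hash a log entry together with the hash of the entry before it."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(chain, payload):
    """Append a payload to the chain, linking it to the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev_hash, "payload": payload,
                  "hash": entry_hash(prev_hash, payload)})


def verify_chain(chain):
    """Return True only if every entry still matches its recorded hash."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical activity logs from two participating sites.
log = []
append_entry(log, {"user": "site_a", "action": "submit_summary_statistics"})
append_entry(log, {"user": "site_b", "action": "run_query"})
ok_before = verify_chain(log)

log[0]["payload"]["action"] = "edited"  # simulate tampering with a record
ok_after = verify_chain(log)
print(ok_before, ok_after)
```

Because each hash covers the previous entry's hash, changing any earlier record invalidates verification from that point on; a real blockchain additionally replicates the chain across nodes so that no single party can rewrite it.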
Affiliation(s)
- Tsung-Ting Kuo
  - UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California, USA
- Anh Pham
  - UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California, USA
- Maxim E Edelson
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla, California, USA
- Jihoon Kim
  - UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California, USA
- Jason Chan
  - Poway High School, Poway, California, USA
- Yash Gupta
  - Canyon Crest Academy, San Diego, California, USA
- Lucila Ohno-Machado
  - UCSD Health Department of Biomedical Informatics, University of California San Diego, La Jolla, California, USA
  - Division of Health Services Research & Development, VA San Diego Healthcare System, San Diego, California, USA
  - Biomedical Informatics and Data Science, Yale School of Medicine, New Haven, Connecticut, USA
5
Joint modeling method of question intent detection and slot filling for domain-oriented question answering system. Data Technologies and Applications 2023. [DOI: 10.1108/dta-07-2022-0281] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/12/2023]
Abstract
Purpose Intent detection and slot filling are two important tasks in question comprehension of a question answering system. This study aims to build a joint task model with some generalization ability and benchmark its performance over other neural network models mentioned in this paper. Design/methodology/approach This study used a deep-learning-based approach for the joint modeling of question intent detection and slot filling. Meanwhile, the internal cell structure of the long short-term memory (LSTM) network was improved. Furthermore, the dataset Computer Science Literature Question (CSLQ) was constructed based on the Science and Technology Knowledge Graph. The datasets Airline Travel Information Systems, Snips (a natural language processing dataset of the consumer intent engine collected by Snips) and CSLQ were used for the empirical analysis. The accuracy of intent detection and F1 score of slot filling, as well as the semantic accuracy of sentences, were compared for several models. Findings The results showed that the proposed model outperformed all other benchmark methods, especially for the CSLQ dataset. This proves that the design of this study improved the comprehensive performance and generalization ability of the model to some extent. Originality/value This study contributes to the understanding of question sentences in a specific domain. LSTM was improved, and a computer literature domain dataset was constructed herein. This will lay the data and model foundation for the future construction of a computer literature question answering system.
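The abstract does not define its metrics, so the sketch below assumes the convention common in joint intent and slot work: intent accuracy counts sentences with the correct intent, and sentence-level semantic accuracy counts sentences where the intent and every slot tag are both correct. The intents, tags, and predictions are made up for illustration.

```python
def intent_accuracy(gold, pred):
    """Fraction of sentences whose predicted intent matches the gold intent."""
    hits = sum(g["intent"] == p["intent"] for g, p in zip(gold, pred))
    return hits / len(gold)


def semantic_accuracy(gold, pred):
    """Fraction of sentences with the correct intent AND all slot tags correct.

    This definition is an assumption; the abstract does not spell it out.
    """
    hits = sum(g["intent"] == p["intent"] and g["slots"] == p["slots"]
               for g, p in zip(gold, pred))
    return hits / len(gold)


# Hypothetical gold annotations and model predictions for three sentences.
gold = [
    {"intent": "flight", "slots": ["O", "B-from", "O", "B-to"]},
    {"intent": "flight", "slots": ["O", "B-date"]},
    {"intent": "airfare", "slots": ["O", "B-to"]},
]
pred = [
    {"intent": "flight", "slots": ["O", "B-from", "O", "B-to"]},  # fully correct
    {"intent": "flight", "slots": ["O", "O"]},                    # slot error
    {"intent": "flight", "slots": ["O", "B-to"]},                 # intent error
]

intent_acc = intent_accuracy(gold, pred)
semantic_acc = semantic_accuracy(gold, pred)
print(round(intent_acc, 3), round(semantic_acc, 3))
```

Semantic accuracy is never higher than intent accuracy under this definition, since a sentence must pass both checks to count.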
6
Van Nguyen K, Do PNT, Nguyen ND, Nguyen AGT, Nguyen NLT. Multi-stage transfer learning with BERTology-based language models for question answering system in Vietnamese. Int J Mach Learn Cyb 2023. [DOI: 10.1007/s13042-022-01735-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2023]
7
COVID-19 detection and classification for machine learning methods using human genomic data. Measurement: Sensors 2022; 24:100537. [PMCID: PMC9595328 DOI: 10.1016/j.measen.2022.100537] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2022] [Revised: 10/12/2022] [Accepted: 10/18/2022] [Indexed: 11/06/2022]
Abstract
COVID-19 is a disease caused by a coronavirus. The World Health Organization has declared COVID-19 a pandemic. It has affected 212 nations and territories worldwide. Examining and identifying patterns in X-ray images of the lungs is still necessary. Early diagnosis may help to lessen a person's exposure to the virus and prevent infection. Manual diagnosis is a time- and labor-intensive process. Since the COVID-19 virus has the potential to infect individuals all around the world, its detection is a pressing concern. The purpose of this study is to apply machine learning to identify and classify coronaviruses. COVID-19 is expected to be discriminated and categorized using CT lung screening and computer-aided diagnosis (CAD). Several machine learning methods, including Decision Tree, Support Vector Machine, K-means clustering, and Radial Basis Function, were used in conjunction with clinical samples from patients who had contracted the coronavirus. While some medical professionals think an RT-PCR test is the most reliable and economical way to detect COVID-19 patients, others think a lung CT scan is more precise and less expensive. Serum samples, respiratory secretions, and whole blood samples are examples of clinical specimens. Based on earlier clinical evaluations, these specimens are used to assess 15 different parameters. In the proposed four-phase CAD system, collection of the CT lung screening images is followed by a pre-processing step that enhances the appearance of ground-glass opacity (GGO) nodules, which are initially extremely fuzzy and poorly contrasted. These zones will be found and segmented using a modified K-means technique. Support vector machine (SVM) and radial basis function (RBF) classifiers, with input and target data at a 50x50 pixel resolution, will be used to categorise the contaminated zones found during the detection phase. The 15 input items gathered from clinical specimens may be entered into a graphical user interface (GUI) tool that has been created to help doctors obtain accurate findings.
8
Qiu D, Yu Y, Chen L. Emotion Analysis of COVID-19 Vaccines Based on a Fuzzy Convolutional Neural Network. Cognit Comput 2022:1-15. [PMID: 36406893 PMCID: PMC9666947 DOI: 10.1007/s12559-022-10068-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2021] [Accepted: 10/16/2022] [Indexed: 11/17/2022]
Abstract
COVID-19 created immense global challenges in 2020, and the world will live under its threat indefinitely. Much of the information on social media supported the government in addressing this major public health event. On January 9, to control the virus, the Chinese government announced universal vaccinations. However, due to a range of varied interpretations, people held different attitudes towards vaccination. Therefore, the success of the mass immunization strategy greatly depended on the public perception of the COVID-19 vaccine. This article explores the changes in people's emotional attitudes towards vaccines and the reasons behind them in the context of the global pandemic in an effort to help mankind overcome this ongoing crisis. For this article, microblogs from January to September containing Chinese people's responses to the COVID-19 vaccines were collected. Based on fuzzy logic and deep learning, we advance the hypothesis that fuzzy vector adaptive improvements will make it possible to better express language emotion and that fuzzy emotion vectors can be integrated into deep learning models, thus making these models more interpretable. Based on this assumption, we design a deep learning model with a fuzzy emotion vector. The experimental results show the positive effect of this model. By applying the model in analyses of people's attitudes towards vaccines, we can obtain people's attitudes towards vaccines in different time periods. We discovered that the most negative emotions about the vaccine appeared in April and that the most positive emotions about the vaccine appeared in February. Combined with word cloud technology and the LDA model, we can effectively explore the reasons for the changes in vaccine attitudes. Our findings show that people's negative emotions about the vaccine are always higher than their positive emotions about the vaccine and that people's attitudes towards the vaccine are closely related to the progress of the epidemic. 
There is also a certain relationship between people's attitudes towards the vaccine and those towards the vaccination.
Affiliation(s)
- Dong Qiu
  - School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Nanan, Chongqing 400065 China
  - College of Science, Chongqing University of Posts and Telecommunications, Nanan, Chongqing 400065 China
  - School of Mathematics and Information Science, Guangxi University, Nanning, China
- Yang Yu
  - School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Nanan, Chongqing 400065 China
- Lei Chen
  - School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Nanan, Chongqing 400065 China
9
Raza S, Schwartz B, Rosella LC. CoQUAD: a COVID-19 question answering dataset system, facilitating research, benchmarking, and practice. BMC Bioinformatics 2022; 23:210. [PMID: 35655148 PMCID: PMC9160513 DOI: 10.1186/s12859-022-04751-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 03/10/2022] [Accepted: 05/26/2022] [Indexed: 12/24/2022]
Abstract
Background Due to the growing amount of COVID-19 research literature, medical experts, clinical scientists, and researchers frequently struggle to stay up to date on the most recent findings. There is a pressing need to assist researchers and practitioners in mining and responding to COVID-19-related questions on time. Methods This paper introduces CoQUAD, a question-answering system that can extract answers related to COVID-19 questions in an efficient manner. There are two datasets provided in this work: a reference-standard dataset built using the CORD-19 and LitCOVID initiatives, and a gold-standard dataset prepared by the experts from a public health domain. The CoQUAD has a Retriever component trained on the BM25 algorithm that searches the reference-standard dataset for relevant documents based on a question related to COVID-19. CoQUAD also has a Reader component that consists of a Transformer-based model, namely MPNet, which is used to read the paragraphs and find the answers related to a question from the retrieved documents. In comparison to previous works, the proposed CoQUAD system can answer questions related to early, mid, and post-COVID-19 topics. Results Extensive experiments on CoQUAD Retriever and Reader modules show that CoQUAD can provide effective and relevant answers to any COVID-19-related questions posed in natural language, with a higher level of accuracy. When compared to state-of-the-art baselines, CoQUAD outperforms the previous models, achieving an exact match ratio score of 77.50% and an F1 score of 77.10%. Conclusion CoQUAD is a question-answering system that mines COVID-19 literature using natural language processing techniques to help the research community find the most recent findings and answer any related questions. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-022-04751-6.
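The Retriever step described in this abstract ranks documents with BM25, which is a term-weighting formula rather than a trained model. A minimal sketch of Okapi BM25 scoring follows; it is not CoQUAD's implementation, and the whitespace tokenizer, the `k1` and `b` defaults, the smoothed IDF variant, and the example documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25.

    Uses lowercase whitespace tokenization and a smoothed, non-negative
    IDF, log(1 + (N - n + 0.5) / (n + 0.5)); both are illustrative choices.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for d in tokenized for term in set(d))

    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            n = doc_freq[term]
            idf = math.log(1 + (n_docs - n + 0.5) / (n + 0.5))
            length_norm = k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


# Hypothetical mini-corpus standing in for a literature collection.
docs = [
    "covid vaccines reduce severe disease",
    "chest x-ray imaging for pneumonia",
    "long covid symptoms after infection",
]
scores = bm25_scores("covid vaccines", docs)
best = max(range(len(docs)), key=lambda i: scores[i])
print(best, scores)
```

In a retriever-reader pipeline of the kind the abstract describes, the top-ranked documents from a step like this would then be passed to the Reader model, which extracts the answer span.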
10
Wang H, Du H, Qi G, Chen H, Hu W, Chen Z. Construction of A Linked Dataset of COVID-19 Knowledge Graphs: Development and Applications. JMIR Med Inform 2022; 10:e37215. [PMID: 35476822 PMCID: PMC9109781 DOI: 10.2196/37215] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/07/2022] [Revised: 04/23/2022] [Accepted: 04/26/2022] [Indexed: 01/22/2023]
Abstract
Background With the continuous spread of COVID-19, information about the worldwide pandemic is exploding. Therefore, it is necessary and significant to organize such a large amount of information. As the key branch of artificial intelligence, a knowledge graph (KG) is helpful to structure, reason, and understand data. Objective To improve the utilization value of the information and effectively aid researchers to combat COVID-19, we have constructed and successively released a unified linked data set named OpenKG-COVID19, which is one of the largest existing KGs related to COVID-19. OpenKG-COVID19 includes 10 interlinked COVID-19 subgraphs covering the topics of encyclopedia, concept, medical, research, event, health, epidemiology, goods, prevention, and character. Methods In this paper, we introduce the key techniques exploited in building COVID-19 KGs in a top-down manner. First, the schema of the modeling process for each KG in OpenKG-COVID19 is described. Second, we propose different methods for extracting knowledge from open government sites, professional texts, public domain–specific sources, and public encyclopedia sites. The curated 10 COVID-19 KGs are further linked together at both the schema and data levels. In addition, we present the naming convention for OpenKG-COVID19. Results OpenKG-COVID19 has more than 2572 concepts, 329,600 entities, 513 properties, and 2,687,329 facts, and the data set will be updated continuously. Each COVID-19 KG was evaluated, and the average precision was found to be above 93%. We have developed search and browse interfaces and a SPARQL endpoint to improve user access. Possible intelligent applications based on OpenKG-COVID19 for further development are also described. Conclusions A KG is useful for intelligent question-answering, semantic searches, recommendation systems, visualization analysis, and decision-making support. 
Research related to COVID-19, biomedicine, and many other communities can benefit from OpenKG-COVID19. Furthermore, the 10 KGs will be continuously updated to ensure that the public will have access to sufficient and up-to-date knowledge.
Affiliation(s)
- Haofen Wang
  - College of Design and Innovation, Tongji University, No. 281 Fuxin Road, Yangpu District, Shanghai, CN
- Huifang Du
  - College of Design and Innovation, Tongji University, No. 281 Fuxin Road, Yangpu District, Shanghai, CN
- Guilin Qi
  - School of Computer Science and Engineering, Southeast University, Nanjing, CN
- Huajun Chen
  - College of Computer Science and Technology, Zhejiang University, Hangzhou, CN
- Wei Hu
  - National Institute of Healthcare Data Science, Nanjing University, Nanjing, CN
- Zhuo Chen
  - College of Computer Science and Technology, Zhejiang University, Hangzhou, CN
11
BERT-Based Transfer-Learning Approach for Nested Named-Entity Recognition Using Joint Labeling. Applied Sciences (Basel) 2022. [DOI: 10.3390/app12030976] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 11/17/2022]
Abstract
Named-entity recognition (NER) is one of the primary components in various natural language processing tasks such as relation extraction, information retrieval, question answering, etc. The majority of the research work deals with flat entities. However, it was observed that the entities were often embedded within other entities. Most of the current state-of-the-art models deal with the problem of embedded/nested entity recognition with very complex neural network architectures. In this research work, we proposed to solve the problem of nested named-entity recognition using the transfer-learning approach. For this purpose, different variants of fine-tuned, pretrained, BERT-based language models were used for the problem using the joint-labeling modeling technique. Two nested named-entity-recognition datasets, i.e., GENIA and GermEval 2014, were used for the experiment, with four and two levels of annotation, respectively. Also, the experiments were performed on the JNLPBA dataset, which has flat annotation. The performance of the above models was measured using F1-score metrics, commonly used as the standard metrics to evaluate the performance of named-entity-recognition models. In addition, the performance of the proposed approach was compared with the conditional random field and the Bi-LSTM-CRF model. It was found that the fine-tuned, pretrained, BERT-based models outperformed the other models significantly without requiring any external resources or feature extraction. The results of the proposed models were compared with various other existing approaches. The best-performing BERT-based model achieved F1-scores of 74.38, 85.29, and 80.68 for the GENIA, GermEval 2014, and JNLPBA datasets, respectively. It was found that the transfer learning (i.e., pretrained BERT models after fine-tuning) based approach for the nested named-entity-recognition task could perform well and is a more generalized approach in comparison to many of the existing approaches.
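One plausible reading of the joint-labeling technique named in this abstract is that the tags from each nesting level are merged into a single label per token, so that an ordinary flat sequence tagger can be fine-tuned on nested data. The sketch below shows only that encoding and its inverse; the separator, the tokens, and the BIO tags are hypothetical and are not taken from the GENIA or GermEval 2014 annotations.

```python
def join_labels(levels, sep="+"):
    """Merge per-level tag sequences into one joint tag per token."""
    return [sep.join(tags) for tags in zip(*levels)]


def split_labels(joint, sep="+"):
    """Recover the per-level tag sequences from joint tags."""
    return [list(level) for level in zip(*(tag.split(sep) for tag in joint))]


# Hypothetical two-level annotation of a four-token phrase: the outer
# entity spans three tokens and contains a one-token inner entity.
tokens = ["human", "IL-2", "gene", "expression"]
outer = ["B-DNA", "I-DNA", "I-DNA", "O"]
inner = ["O", "B-protein", "O", "O"]

joint = join_labels([outer, inner])
recovered = split_labels(joint)
print(list(zip(tokens, joint)))
```

The trade-off of this encoding is a larger label set, since each distinct combination of level tags becomes its own class, in exchange for reusing a standard token-classification model without architectural changes.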