Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Tawfik NS, Spruit MR. Evaluating sentence representations for biomedical text: Methods and experimental results. J Biomed Inform 2020;104:103396. [PMID: 32147441 DOI: 10.1016/j.jbi.2020.103396] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2019] [Revised: 01/22/2020] [Accepted: 02/24/2020] [Indexed: 10/24/2022]

For:	Tawfik NS, Spruit MR. Evaluating sentence representations for biomedical text: Methods and experimental results. J Biomed Inform 2020;104:103396. [PMID: 32147441 DOI: 10.1016/j.jbi.2020.103396] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2019] [Revised: 01/22/2020] [Accepted: 02/24/2020] [Indexed: 10/24/2022]

Number

Cited by Other Article(s)

Cho HN, Jun TJ, Kim YH, Kang H, Ahn I, Gwon H, Kim Y, Seo J, Choi H, Kim M, Han J, Kee G, Park S, Ko S. Task-Specific Transformer-Based Language Models in Health Care: Scoping Review. JMIR Med Inform 2024;12:e49724. [PMID: 39556827 PMCID: PMC11612605 DOI: 10.2196/49724] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2023] [Revised: 07/10/2023] [Accepted: 10/21/2024] [Indexed: 11/20/2024] Open

Abstract

BACKGROUND

Transformer-based language models have shown great potential to revolutionize health care by advancing clinical decision support, patient interaction, and disease prediction. However, despite their rapid development, the implementation of transformer-based language models in health care settings remains limited. This is partly due to the lack of a comprehensive review, which hinders a systematic understanding of their applications and limitations. Without clear guidelines and consolidated information, both researchers and physicians face difficulties in using these models effectively, resulting in inefficient research efforts and slow integration into clinical workflows.

OBJECTIVE

This scoping review addresses this gap by examining studies on medical transformer-based language models and categorizing them into 6 tasks: dialogue generation, question answering, summarization, text classification, sentiment analysis, and named entity recognition.

METHODS

We conducted a scoping review following the Cochrane scoping review protocol. A comprehensive literature search was performed across databases, including Google Scholar and PubMed, covering publications from January 2017 to September 2024. Studies involving transformer-derived models in medical tasks were included. Data were categorized into 6 key tasks.

RESULTS

Our key findings revealed both advancements and critical challenges in applying transformer-based models to health care tasks. For example, models like MedPIR involving dialogue generation show promise but face privacy and ethical concerns, while question-answering models like BioBERT improve accuracy but struggle with the complexity of medical terminology. The BioBERTSum summarization model aids clinicians by condensing medical texts but needs better handling of long sequences.

CONCLUSIONS

This review attempted to provide a consolidated understanding of the role of transformer-based language models in health care and to guide future research directions. By addressing current challenges and exploring the potential for real-world applications, we envision significant improvements in health care informatics. Addressing the identified challenges and implementing proposed solutions can enable transformer-based language models to significantly improve health care delivery and patient outcomes. Our review provides valuable insights for future research and practical applications, setting the stage for transformative advancements in medical informatics.

Collapse

Affiliation(s)

Ha Na Cho Department of Information Medicine, Asan Medical Center, Seoul, Republic of Korea
Tae Joon Jun Big Data Research Center, Asan Institute for Life Sciences, Asan Medical Center, Seoul, Republic of Korea
Young-Hak Kim Division of Cardiology, Department of Information Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
Heejun Kang Division of Cardiology, Asan Medical Center, Seoul, Republic of Korea
Imjin Ahn Department of Information Medicine, Asan Medical Center, Seoul, Republic of Korea
Hansle Gwon Department of Information Medicine, Asan Medical Center, Seoul, Republic of Korea
Yunha Kim Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
Jiahn Seo Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
Heejung Choi Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
Minkyoung Kim Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
Jiye Han Department of Medical Science, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
Gaeun Kee Department of Information Medicine, Asan Medical Center, Seoul, Republic of Korea
Seohyun Park Department of Information Medicine, Asan Medical Center, Seoul, Republic of Korea
Soyoung Ko Department of Information Medicine, Asan Medical Center, Seoul, Republic of Korea

Collapse

Bazoge A, Morin E, Daille B, Gourraud PA. Applying Natural Language Processing to Textual Data From Clinical Data Warehouses: Systematic Review. JMIR Med Inform 2023;11:e42477. [PMID: 38100200 PMCID: PMC10757232 DOI: 10.2196/42477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2022] [Revised: 01/16/2023] [Accepted: 09/07/2023] [Indexed: 12/17/2023] Open

Abstract

BACKGROUND

In recent years, health data collected during the clinical care process have been often repurposed for secondary use through clinical data warehouses (CDWs), which interconnect disparate data from different sources. A large amount of information of high clinical value is stored in unstructured text format. Natural language processing (NLP), which implements algorithms that can operate on massive unstructured textual data, has the potential to structure the data and make clinical information more accessible.

OBJECTIVE

The aim of this review was to provide an overview of studies applying NLP to textual data from CDWs. It focuses on identifying the (1) NLP tasks applied to data from CDWs and (2) NLP methods used to tackle these tasks.

METHODS

This review was performed according to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. We searched for relevant articles in 3 bibliographic databases: PubMed, Google Scholar, and ACL Anthology. We reviewed the titles and abstracts and included articles according to the following inclusion criteria: (1) focus on NLP applied to textual data from CDWs, (2) articles published between 1995 and 2021, and (3) written in English.

RESULTS

We identified 1353 articles, of which 194 (14.34%) met the inclusion criteria. Among all identified NLP tasks in the included papers, information extraction from clinical text (112/194, 57.7%) and the identification of patients (51/194, 26.3%) were the most frequent tasks. To address the various tasks, symbolic methods were the most common NLP methods (124/232, 53.4%), showing that some tasks can be partially achieved with classical NLP techniques, such as regular expressions or pattern matching that exploit specialized lexica, such as drug lists and terminologies. Machine learning (70/232, 30.2%) and deep learning (38/232, 16.4%) have been increasingly used in recent years, including the most recent approaches based on transformers. NLP methods were mostly applied to English language data (153/194, 78.9%).

CONCLUSIONS

CDWs are central to the secondary use of clinical texts for research purposes. Although the use of NLP on data from CDWs is growing, there remain challenges in this field, especially with regard to languages other than English. Clinical NLP is an effective strategy for accessing, extracting, and transforming data from CDWs. Information retrieved with NLP can assist in clinical research and have an impact on clinical practice.

Collapse

Luo M, Li S, Pang Y, Yao L, Ma R, Huang HY, Huang HD, Lee TY. Extraction of microRNA-target interaction sentences from biomedical literature by deep learning approach. Brief Bioinform 2023;24:6847797. [PMID: 36440972 DOI: 10.1093/bib/bbac497] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2022] [Revised: 10/16/2022] [Accepted: 10/19/2022] [Indexed: 11/29/2022] Open

Chen X, Xie H, Li Z, Zhang D, Cheng G, Wang FL, Dai HN, Li Q. Leveraging deep learning for automatic literature screening in intelligent bibliometrics. INT J MACH LEARN CYB 2022. [DOI: 10.1007/s13042-022-01710-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022]

Lara-Clares A, Lastra-Díaz JJ, Garcia-Serrano A. A reproducible experimental survey on biomedical sentence similarity: A string-based method sets the state of the art. PLoS One 2022;17:e0276539. [PMID: 36409715 PMCID: PMC9678326 DOI: 10.1371/journal.pone.0276539] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2022] [Accepted: 10/08/2022] [Indexed: 11/22/2022] Open

Abstract

This registered report introduces the largest, and for the first time, reproducible experimental survey on biomedical sentence similarity with the following aims: (1) to elucidate the state of the art of the problem; (2) to solve some reproducibility problems preventing the evaluation of most current methods; (3) to evaluate several unexplored sentence similarity methods; (4) to evaluate for the first time an unexplored benchmark, called Corpus-Transcriptional-Regulation (CTR); (5) to carry out a study on the impact of the pre-processing stages and Named Entity Recognition (NER) tools on the performance of the sentence similarity methods; and finally, (6) to bridge the lack of software and data reproducibility resources for methods and experiments in this line of research. Our reproducible experimental survey is based on a single software platform, which is provided with a detailed reproducibility protocol and dataset as supplementary material to allow the exact replication of all our experiments and results. In addition, we introduce a new aggregated string-based sentence similarity method, called LiBlock, together with eight variants of current ontology-based methods, and a new pre-trained word embedding model trained on the full-text articles in the PMC-BioC corpus. Our experiments show that our novel string-based measure establishes the new state of the art in sentence similarity analysis in the biomedical domain and significantly outperforms all the methods evaluated herein, with the only exception of one ontology-based method. Likewise, our experiments confirm that the pre-processing stages, and the choice of the NER tool for ontology-based methods, have a very significant impact on the performance of the sentence similarity methods. We also detail some drawbacks and limitations of current methods, and highlight the need to refine the current benchmarks. Finally, a notable finding is that our new string-based method significantly outperforms all state-of-the-art Machine Learning (ML) models evaluated herein.

Collapse

Zhang L, Lu W, Chen H, Huang Y, Cheng Q. A comparative evaluation of biomedical similar article recommendation. J Biomed Inform 2022;131:104106. [PMID: 35661818 DOI: 10.1016/j.jbi.2022.104106] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2021] [Revised: 05/27/2022] [Accepted: 05/28/2022] [Indexed: 11/28/2022]

Abstract

BACKGROUND

Biomedical sciences, with their focus on human health and disease, have attracted unprecedented attention in the 21st century. The proliferation of biomedical sciences has also led to a large number of scientific articles being produced, which makes it difficult for biomedical researchers to find relevant articles and hinders the dissemination of valuable discoveries. To bridge this gap, the research community has initiated the article recommendation task, with the aim of recommending articles to biomedical researchers automatically based on their research interests. Over the past two decades, many recommendation methods have been developed. However, an algorithm-level comparison and rigorous evaluation of the most important methods on a shared dataset is still lacking.

METHOD

In this study, we first investigate 15 methods for automated article recommendation in the biomedical domain. We then conduct an empirical evaluation of the 15 methods, including six term-based methods, two word embedding methods, three sentence embedding methods, two document embedding methods, and two BERT-based methods. These methods are evaluated in two scenarios: article-oriented recommenders and user-oriented recommenders, with two publicly available datasets: TREC 2005 Genomics and RELISH, respectively.

RESULTS

Our experimental results show that the text representation models BERT and BioSenVec outperform many existing recommendation methods (e.g., BM25, PMRA, XPRC) and web-based recommendation systems (e.g., MScanner, MedlineRanker, BioReader) on both datasets regarding most of the evaluation metrics, and fine-tuning can improve the performance of the BERT-based methods.

CONCLUSIONS

Our comparison study is useful for researchers and practitioners in selecting the best modeling strategies for building article recommendation systems in the biomedical domain. The code and datasets are publicly available.

Collapse

Manifold biomedical text sentence embedding. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.04.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]

Clinical language search algorithm from free-text: facilitating appropriate imaging. BMC Med Imaging 2022;22:18. [PMID: 35120466 PMCID: PMC8815252 DOI: 10.1186/s12880-022-00740-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/20/2021] [Accepted: 01/18/2022] [Indexed: 11/22/2022] Open

Abstract

Background

The comprehensiveness and maintenance of the American College of Radiology (ACR) Appropriateness Criteria (AC) makes it a unique resource for evidence-based clinical imaging decision support, but it is underutilized by clinicians. To facilitate the use of imaging recommendations, we develop a natural language processing (NLP) search algorithm that automatically matches clinical indications that physicians write into imaging orders to appropriate AC imaging recommendations.

Methods

We apply a hybrid model of semantic similarity from a sent2vec model trained on 223 million scientific sentences, combined with term frequency inverse document frequency features. AC documents are ranked based on their embeddings’ cosine distance to query. For model testing, we compiled a dataset of simulated simple and complex indications for each AC document (n = 410) and another with clinical indications from randomly sampled radiology reports (n = 100). We compare our algorithm to a custom google search engine.

Results

On the simulated indications, our algorithm ranked ground truth documents as top 3 for 98% of simple queries and 85% of complex queries. Similarly, on the randomly sampled radiology report dataset, the algorithm ranked 86% of indications with a single match as top 3. Vague and distracting phrases present in the free-text indications were main sources of errors. Our algorithm provides more relevant results than a custom Google search engine, especially for complex queries.

Conclusions

We have developed and evaluated an NLP algorithm that matches clinical indications to appropriate AC guidelines. This approach can be integrated into imaging ordering systems for automated access to guidelines.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12880-022-00740-6.

Collapse

Wan C, Ge X, Wang J, Zhang X, Yu Y, Hu J, Liu Y, Ma H. Identification and Impact Analysis of Family History of Psychiatric Disorder in Mood Disorder Patients With Pretrained Language Model. Front Psychiatry 2022;13:861930. [PMID: 35669265 PMCID: PMC9163373 DOI: 10.3389/fpsyt.2022.861930] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/25/2022] [Accepted: 04/20/2022] [Indexed: 11/13/2022] Open

Chen Q, Rankine A, Peng Y, Aghaarabi E, Lu Z. Benchmarking Effectiveness and Efficiency of Deep Learning Models for Semantic Textual Similarity in the Clinical Domain: Validation Study. JMIR Med Inform 2021;9:e27386. [PMID: 34967748 PMCID: PMC8759018 DOI: 10.2196/27386] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/22/2021] [Revised: 08/06/2021] [Accepted: 08/06/2021] [Indexed: 01/23/2023] Open

Abstract

Background

Semantic textual similarity (STS) measures the degree of relatedness between sentence pairs. The Open Health Natural Language Processing (OHNLP) Consortium released an expertly annotated STS data set and called for the National Natural Language Processing Clinical Challenges. This work describes our entry, an ensemble model that leverages a range of deep learning (DL) models. Our team from the National Library of Medicine obtained a Pearson correlation of 0.8967 in an official test set during 2019 National Natural Language Processing Clinical Challenges/Open Health Natural Language Processing shared task and achieved a second rank.

Objective

Although our models strongly correlate with manual annotations, annotator-level correlation was only moderate (weighted Cohen κ=0.60). We are cautious of the potential use of DL models in production systems and argue that it is more critical to evaluate the models in-depth, especially those with extremely high correlations. In this study, we benchmark the effectiveness and efficiency of top-ranked DL models. We quantify their robustness and inference times to validate their usefulness in real-time applications.

Methods

We benchmarked five DL models, which are the top-ranked systems for STS tasks: Convolutional Neural Network, BioSentVec, BioBERT, BlueBERT, and ClinicalBERT. We evaluated a random forest model as an additional baseline. For each model, we repeated the experiment 10 times, using the official training and testing sets. We reported 95% CI of the Wilcoxon rank-sum test on the average Pearson correlation (official evaluation metric) and running time. We further evaluated Spearman correlation, R², and mean squared error as additional measures.

Results

Using only the official training set, all models obtained highly effective results. BioSentVec and BioBERT achieved the highest average Pearson correlations (0.8497 and 0.8481, respectively). BioSentVec also had the highest results in 3 of 4 effectiveness measures, followed by BioBERT. However, their robustness to sentence pairs of different similarity levels varies significantly. A particular observation is that BERT models made the most errors (a mean squared error of over 2.5) on highly similar sentence pairs. They cannot capture highly similar sentence pairs effectively when they have different negation terms or word orders. In addition, time efficiency is dramatically different from the effectiveness results. On average, the BERT models were approximately 20 times and 50 times slower than the Convolutional Neural Network and BioSentVec models, respectively. This results in challenges for real-time applications.

Conclusions

Despite the excitement of further improving Pearson correlations in this data set, our results highlight that evaluations of the effectiveness and efficiency of STS models are critical. In future, we suggest more evaluations on the generalization capability and user-level testing of the models. We call for community efforts to create more biomedical and clinical STS data sets from different perspectives to reflect the multifaceted notion of sentence-relatedness.

Collapse

Sentence representation with manifold learning for biomedical texts. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.106869] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]

Lara-Clares A, Lastra-Díaz JJ, Garcia-Serrano A. Protocol for a reproducible experimental survey on biomedical sentence similarity. PLoS One 2021;16:e0248663. [PMID: 33760855 PMCID: PMC7990182 DOI: 10.1371/journal.pone.0248663] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2020] [Accepted: 03/02/2021] [Indexed: 11/28/2022] Open

Abstract

Measuring semantic similarity between sentences is a significant task in the fields of Natural Language Processing (NLP), Information Retrieval (IR), and biomedical text mining. For this reason, the proposal of sentence similarity methods for the biomedical domain has attracted a lot of attention in recent years. However, most sentence similarity methods and experimental results reported in the biomedical domain cannot be reproduced for multiple reasons as follows: the copying of previous results without confirmation, the lack of source code and data to replicate both methods and experiments, and the lack of a detailed definition of the experimental setup, among others. As a consequence of this reproducibility gap, the state of the problem can be neither elucidated nor new lines of research be soundly set. On the other hand, there are other significant gaps in the literature on biomedical sentence similarity as follows: (1) the evaluation of several unexplored sentence similarity methods which deserve to be studied; (2) the evaluation of an unexplored benchmark on biomedical sentence similarity, called Corpus-Transcriptional-Regulation (CTR); (3) a study on the impact of the pre-processing stage and Named Entity Recognition (NER) tools on the performance of the sentence similarity methods; and finally, (4) the lack of software and data resources for the reproducibility of methods and experiments in this line of research. Identified these open problems, this registered report introduces a detailed experimental setup, together with a categorization of the literature, to develop the largest, updated, and for the first time, reproducible experimental survey on biomedical sentence similarity. Our aforementioned experimental survey will be based on our own software replication and the evaluation of all methods being studied on the same software platform, which will be specially developed for this work, and it will become the first publicly available software library for biomedical sentence similarity. Finally, we will provide a very detailed reproducibility protocol and dataset as supplementary material to allow the exact replication of all our experiments and results.

Collapse

Liu H, Zhang Z, Xu Y, Wang N, Huang Y, Yang Z, Jiang R, Chen H. Use of BERT (Bidirectional Encoder Representations from Transformers)-Based Deep Learning Method for Extracting Evidences in Chinese Radiology Reports: Development of a Computer-Aided Liver Cancer Diagnosis Framework. J Med Internet Res 2021;23:e19689. [PMID: 33433395 PMCID: PMC7837998 DOI: 10.2196/19689] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/29/2020] [Revised: 06/30/2020] [Accepted: 11/11/2020] [Indexed: 01/17/2023] Open

Abstract

Background

Liver cancer is a substantial disease burden in China. As one of the primary diagnostic tools for detecting liver cancer, dynamic contrast-enhanced computed tomography provides detailed evidences for diagnosis that are recorded in free-text radiology reports.

Objective

The aim of our study was to apply a deep learning model and rule-based natural language processing (NLP) method to identify evidences for liver cancer diagnosis automatically.

Methods

We proposed a pretrained, fine-tuned BERT (Bidirectional Encoder Representations from Transformers)-based BiLSTM-CRF (Bidirectional Long Short-Term Memory-Conditional Random Field) model to recognize the phrases of APHE (hyperintense enhancement in the arterial phase) and PDPH (hypointense in the portal and delayed phases). To identify more essential diagnostic evidences, we used the traditional rule-based NLP methods for the extraction of radiological features. APHE, PDPH, and other extracted radiological features were used to design a computer-aided liver cancer diagnosis framework by random forest.

Results

The BERT-BiLSTM-CRF predicted the phrases of APHE and PDPH with an F1 score of 98.40% and 90.67%, respectively. The prediction model using combined features had a higher performance (F1 score, 88.55%) than those using APHE and PDPH (84.88%) or other extracted radiological features (83.52%). APHE and PDPH were the top 2 essential features for liver cancer diagnosis.

Conclusions

This work was a comprehensive NLP study, wherein we identified evidences for the diagnosis of liver cancer from Chinese radiology reports, considering both clinical knowledge and radiology findings. The BERT-based deep learning method for the extraction of diagnostic evidence achieved state-of-the-art performance. The high performance proves the feasibility of the BERT-BiLSTM-CRF model in information extraction from Chinese radiology reports. The findings of our study suggest that the deep learning–based method for automatically identifying evidences for diagnosis can be extended to other types of Chinese clinical texts.

Collapse

Yang X, He X, Zhang H, Ma Y, Bian J, Wu Y. Measurement of Semantic Textual Similarity in Clinical Texts: Comparison of Transformer-Based Models. JMIR Med Inform 2020;8:e19735. [PMID: 33226350 PMCID: PMC7721552 DOI: 10.2196/19735] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2020] [Revised: 10/19/2020] [Accepted: 10/26/2020] [Indexed: 12/31/2022] Open

Abstract

BACKGROUND

Semantic textual similarity (STS) is one of the fundamental tasks in natural language processing (NLP). Many shared tasks and corpora for STS have been organized and curated in the general English domain; however, such resources are limited in the biomedical domain. In 2019, the National NLP Clinical Challenges (n2c2) challenge developed a comprehensive clinical STS dataset and organized a community effort to solicit state-of-the-art solutions for clinical STS.

OBJECTIVE

This study presents our transformer-based clinical STS models developed during this challenge as well as new models we explored after the challenge. This project is part of the 2019 n2c2/Open Health NLP shared task on clinical STS.

METHODS

In this study, we explored 3 transformer-based models for clinical STS: Bidirectional Encoder Representations from Transformers (BERT), XLNet, and Robustly optimized BERT approach (RoBERTa). We examined transformer models pretrained using both general English text and clinical text. We also explored using a general English STS dataset as a supplementary corpus in addition to the clinical training set developed in this challenge. Furthermore, we investigated various ensemble methods to combine different transformer models.

RESULTS

Our best submission based on the XLNet model achieved the third-best performance (Pearson correlation of 0.8864) in this challenge. After the challenge, we further explored other transformer models and improved the performance to 0.9065 using a RoBERTa model, which outperformed the best-performing system developed in this challenge (Pearson correlation of 0.9010).

CONCLUSIONS

This study demonstrated the efficiency of utilizing transformer-based models to measure semantic similarity for clinical text. Our models can be applied to clinical applications such as clinical text deduplication and summarization.

Collapse

García-Díaz JA, Cánovas-García M, Valencia-García R. Ontology-driven aspect-based sentiment analysis classification: An infodemiological case study regarding infectious diseases in Latin America. FUTURE GENERATIONS COMPUTER SYSTEMS : FGCS 2020;112:641-657. [PMID: 32572291 PMCID: PMC7301140 DOI: 10.1016/j.future.2020.06.019] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 03/10/2020] [Revised: 06/12/2020] [Accepted: 06/14/2020] [Indexed: 06/11/2023]

Can We Survive without Labelled Data in NLP? Transfer Learning for Open Information Extraction. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10175758] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]