1
Scherbakov D, Hubig N, Jansari V, Bakumenko A, Lenert LA. The emergence of large language models as tools in literature reviews: a large language model-assisted systematic review. J Am Med Inform Assoc 2025; 32:1071-1086. [PMID: 40332983 PMCID: PMC12089777 DOI: 10.1093/jamia/ocaf063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2025] [Revised: 04/02/2025] [Accepted: 04/11/2025] [Indexed: 05/08/2025] Open
Abstract
OBJECTIVES This study aims to summarize the usage of large language models (LLMs) in the process of creating a scientific review by looking at the methodological papers that describe the use of LLMs in review automation and the review papers that mention they were made with the support of LLMs. MATERIALS AND METHODS The search was conducted in June 2024 in PubMed, Scopus, Dimensions, and Google Scholar by human reviewers. The screening and extraction process took place in Covidence with the help of an LLM add-on based on the OpenAI GPT-4o model. ChatGPT and Scite.ai were used in cleaning the data, generating the code for figures, and drafting the manuscript. RESULTS Of the 3788 articles retrieved, 172 studies were deemed eligible for the final review. ChatGPT and GPT-based LLMs emerged as the most dominant architecture for review automation (n = 126, 73.2%). A significant number of review automation projects were found, but only a limited number of papers (n = 26, 15.1%) were actual reviews that acknowledged LLM usage. Most citations focused on the automation of a particular stage of review, such as Searching for publications (n = 60, 34.9%) and Data extraction (n = 54, 31.4%). When comparing the pooled performance of GPT-based and BERT-based models, the former was better in data extraction with a mean precision of 83.0% (SD = 10.4) and a recall of 86.0% (SD = 9.8). DISCUSSION AND CONCLUSION Our LLM-assisted systematic review revealed a significant number of research projects related to review automation using LLMs. Despite limitations, such as lower accuracy of extraction for numeric data, we anticipate that LLMs will soon change the way scientific reviews are conducted.
Affiliation(s)
- Dmitry Scherbakov
  - Biomedical Informatics Center, Department of Public Health Sciences, Medical University of South Carolina (MUSC), Charleston, SC 29403, United States
- Nina Hubig
  - Biomedical Informatics Center, Department of Public Health Sciences, Medical University of South Carolina (MUSC), Charleston, SC 29403, United States
  - Interdisciplinary Transformation University, OG 2 A-4040 Linz, Austria
- Vinita Jansari
  - School of Computing, Clemson University, Charleston, SC 29634, United States
- Alexander Bakumenko
  - School of Computing, Clemson University, Charleston, SC 29634, United States
- Leslie A Lenert
  - Biomedical Informatics Center, Department of Public Health Sciences, Medical University of South Carolina (MUSC), Charleston, SC 29403, United States

2
Chen F, Zhang G, Fang Y, Peng Y, Weng C. Semi-supervised learning from small annotated data and large unlabeled data for fine-grained Participants, Intervention, Comparison, and Outcomes entity recognition. J Am Med Inform Assoc 2025; 32:555-565. [PMID: 39823371 PMCID: PMC11833487 DOI: 10.1093/jamia/ocae326] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2024] [Revised: 12/16/2024] [Accepted: 12/30/2024] [Indexed: 01/19/2025] Open
Abstract
OBJECTIVE Extracting PICO elements (Participants, Intervention, Comparison, and Outcomes) from clinical trial literature is essential for clinical evidence retrieval, appraisal, and synthesis. Existing approaches do not distinguish the attributes of PICO entities. This study aims to develop a named entity recognition (NER) model to extract PICO entities with fine granularities. MATERIALS AND METHODS Using a corpus of 2511 abstracts with PICO mentions from 4 public datasets, we developed a semi-supervised method to facilitate the training of a NER model, FinePICO, by combining limited annotated data of PICO entities and abundant unlabeled data. For evaluation, we divided the entire dataset into 2 subsets: a smaller group with annotations and a larger group without annotations. We then established the theoretical lower and upper performance bounds based on the performance of supervised learning models trained solely on the small, annotated subset and on the entire set with complete annotations, respectively. Finally, we evaluated FinePICO on both the smaller annotated subset and the larger, initially unannotated subset. We measured the performance of FinePICO using precision, recall, and F1. RESULTS Our method achieved precision/recall/F1 of 0.567/0.636/0.60, respectively, using a small set of annotated samples, outperforming the baseline model (F1: 0.437) by more than 16 percentage points. The model demonstrates generalizability to a different PICO framework and to another corpus, and consistently outperforms the benchmark in diverse experimental settings (P-value < .001). DISCUSSION We developed FinePICO to recognize fine-grained PICO entities from text and validated its performance across diverse experimental settings, highlighting the feasibility of using semi-supervised learning (SSL) techniques to enhance PICO entity extraction. Future work can focus on optimizing SSL algorithms to improve efficiency and reduce computational costs.
CONCLUSION This study contributes a generalizable and effective semi-supervised approach leveraging large unlabeled data together with small, annotated data for fine-grained PICO extraction.
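As a quick check on the figures reported in this abstract, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the stated precision and recall; the function name `f1_score` is illustrative and is not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract above.
fine_pico_f1 = f1_score(0.567, 0.636)
print(round(fine_pico_f1, 2))  # 0.6

# Absolute gap to the reported baseline F1 of 0.437, in percentage points.
gap_points = (fine_pico_f1 - 0.437) * 100
```

The gap to the baseline comes out at a little over 16 points, which suggests the abstract's "more than 16%" is an absolute difference rather than a relative one.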
Affiliation(s)
- Fangyi Chen
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, United States
- Gongbo Zhang
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, United States
- Yilu Fang
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, United States
- Yifan Peng
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY 10065, United States
- Chunhua Weng
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, United States

3
Pillay TS, Topcu Dİ, Yenice S. Harnessing AI for enhanced evidence-based laboratory medicine (EBLM). Clin Chim Acta 2025; 569:120181. [PMID: 39909187 DOI: 10.1016/j.cca.2025.120181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2024] [Revised: 01/31/2025] [Accepted: 02/02/2025] [Indexed: 02/07/2025]
Abstract
The integration of artificial intelligence (AI) into laboratory medicine is revolutionizing diagnostic accuracy, operational efficiency, and personalized patient care. AI technologies (machine learning, natural language processing, and computer vision) advance evidence-based laboratory medicine (EBLM) by automating and optimizing critical processes (formulating clinical questions, conducting literature searches, appraising evidence, and developing clinical guidelines). These reduce the time for systematic reviews, ensure consistency in appraisal, and enable real-time updates to guidelines. AI supports personalized medicine by analyzing large datasets, genetic information, and electronic health records (EHRs) to tailor diagnostic and treatment plans to patient profiles. Predictive analytics enhance outcomes by leveraging historical data and ongoing monitoring to predict responses and optimize care pathways. Despite the transformative potential, there are challenges. The accuracy, transparency, and explainability of AI algorithms are critical for gaining trust and ensuring ethical deployment. Integration into existing clinical workflows requires collaboration between AI developers and users to ensure seamless, user-friendly adoption. Ethical considerations, such as privacy, data security, and algorithmic bias, must also be addressed to mitigate risks and ensure equitable healthcare delivery. Regulatory frameworks, e.g., the EU AI Regulation, emphasize transparency, data governance, and human oversight, particularly for high-risk AI systems. The economic and operational benefits are cost savings, improved diagnostic precision, and enhanced patient outcomes. Future trends (federated learning and self-supervised learning) will enhance the scalability and applicability of AI in EBLM, paving the way for a new era of precision medicine. AI in EBLM has the potential to transform healthcare delivery, improve patient outcomes, and advance personalized/precision medicine.
Affiliation(s)
- Tahir S Pillay
  - Department of Chemical Pathology, Faculty of Health Sciences and National Health Laboratory Service Tshwane Academic Division, University of Pretoria, Pretoria, South Africa; Division of Chemical Pathology, University of Cape Town, Cape Town, South Africa.
- Deniz İlhan Topcu
  - Department of Medical Biochemistry, İzmir City Hospital, İzmir, Türkiye
- Sedef Yenice
  - Group Florence Nightingale Hospitals, Istanbul, Türkiye

4
Sugiura A, Saegusa S, Jin Y, Yoshimoto R, Smith ND, Dohi K, Higuchi T, Kozu T. Evaluation of RMES, an Automated Software Tool Utilizing AI, for Literature Screening with Reference to Published Systematic Reviews as Case-Studies: Development and Usability Study. JMIR Form Res 2024; 8:e55827. [PMID: 39652380 PMCID: PMC11667133 DOI: 10.2196/55827] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2023] [Revised: 04/09/2024] [Accepted: 09/06/2024] [Indexed: 12/12/2024] Open
Abstract
BACKGROUND Systematic reviews and meta-analyses are important to evidence-based medicine, but the information retrieval and literature screening procedures are burdensome tasks. Rapid Medical Evidence Synthesis (RMES; Deloitte Tohmatsu Risk Advisory LLC) is software designed to support information retrieval, literature screening, and data extraction for evidence-based medicine. OBJECTIVE This study aimed to evaluate the accuracy of RMES for literature screening with reference to published systematic reviews. METHODS We used RMES to automatically screen the titles and abstracts of PubMed-indexed articles included in 12 systematic reviews across 6 medical fields, by applying 4 filters: (1) study type; (2) study type + disease; (3) study type + intervention; and (4) study type + disease + intervention. We determined the numbers of articles correctly included by each filter relative to those included by the authors of each systematic review. Only PubMed-indexed articles were assessed. RESULTS Across the 12 reviews, the number of articles analyzed by RMES ranged from 46 to 5612. The number of PubMed-cited articles included in the reviews ranged from 4 to 47. The median (range) percentages of articles correctly labeled by RMES using filters 1-4 were: 80.9% (57.1%-100%), 65.2% (34.1%-81.8%), 70.5% (0%-100%), and 58.6% (0%-81.8%), respectively. CONCLUSIONS This study demonstrated good performance and accuracy of RMES for the initial screening of the titles and abstracts of articles for use in systematic reviews. RMES has the potential to reduce the workload involved in the initial screening of published studies.
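The evaluation metric described above (articles correctly included by a filter, relative to those the review authors included, summarized as a median and range across reviews) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the article identifiers are hypothetical, and the data are invented for the example rather than taken from the study.

```python
from statistics import median


def percent_correctly_included(filter_hits: set, review_included: set) -> float:
    """Share of the review's included articles that the filter also kept, in percent."""
    if not review_included:
        raise ValueError("review_included must not be empty")
    return 100 * len(filter_hits & review_included) / len(review_included)


# Hypothetical data: (articles kept by one filter, articles included by the review authors).
reviews = [
    ({"a1", "a2", "a3", "x9"}, {"a1", "a2", "a3", "a4"}),  # 3 of 4 recovered
    ({"b1", "b2"}, {"b1", "b2"}),                          # 2 of 2 recovered
    ({"c1"}, {"c1", "c2"}),                                # 1 of 2 recovered
]

percentages = [percent_correctly_included(hits, gold) for hits, gold in reviews]
summary = (median(percentages), min(percentages), max(percentages))
print(summary)  # (median, lowest, highest) across the hypothetical reviews
```

Note that this measure behaves like recall against the published review: articles the filter kept but the authors excluded (such as `x9`) do not lower the score.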
Affiliation(s)
- Ayaka Sugiura
  - Deloitte Analytics, Deloitte Tohmatsu Risk Advisory LLC, Tokyo, Japan
- Satoshi Saegusa
  - Deloitte Analytics, Deloitte Tohmatsu Risk Advisory LLC, Tokyo, Japan
- Yingzi Jin
  - Deloitte Analytics, Deloitte Tohmatsu Risk Advisory LLC, Tokyo, Japan
- Riki Yoshimoto
  - Evidence Generation & Communication Division, EMC K.K., Tokyo, Japan
- Nicholas D Smith
  - Evidence Generation & Communication Division, EMC K.K., Tokyo, Japan
- Koji Dohi
  - Evidence Generation & Communication Division, EMC K.K., Tokyo, Japan
- Tadashi Higuchi
  - Evidence Generation & Communication Division, EMC K.K., Tokyo, Japan
- Tomotake Kozu
  - Deloitte Analytics, Deloitte Tohmatsu Risk Advisory LLC, Tokyo, Japan

5
Benson R, Elia M, Hyams B, Chang JH, Hong JC. A Narrative Review on the Application of Large Language Models to Support Cancer Care and Research. Yearb Med Inform 2024; 33:90-98. [PMID: 40199294 PMCID: PMC12020524 DOI: 10.1055/s-0044-1800726] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/10/2025] Open
Abstract
OBJECTIVES The emergence of large language models has resulted in a significant shift in informatics research and carries promise in clinical cancer care. Here we provide a narrative review of the recent use of large language models (LLMs) to support cancer care, prevention, and research. METHODS We performed a search of the Scopus database for studies on the application of bidirectional encoder representations from transformers (BERT) and generative-pretrained transformer (GPT) LLMs in cancer care published between the start of 2021 and the end of 2023. We present salient and impactful papers related to each of these themes. RESULTS Studies identified focused on aspects of clinical decision support (CDS), cancer education, and support for research activities. The use of LLMs for CDS primarily focused on aspects of treatment and screening planning, treatment response, and the management of adverse events. Studies using LLMs for cancer education typically focused on question-answering, assessing cancer myths and misconceptions, and text summarization and simplification. Finally, studies using LLMs to support research activities focused on scientific writing and idea generation, cohort identification and extraction, clinical data processing, and NLP-centric tasks. CONCLUSIONS The application of LLMs in cancer care has shown promise across a variety of diverse use cases. Future research should utilize quantitative metrics, qualitative insights, and user insights in the development and evaluation of LLM-based cancer care tools. The development of open-source LLMs for use in cancer care research and activities should also be a priority.
Affiliation(s)
- Ryzen Benson
  - Department of Radiation Oncology, University of California, San Francisco, San Francisco, California
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, California
- Marianna Elia
  - Department of Radiation Oncology, University of California, San Francisco, San Francisco, California
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, California
- Benjamin Hyams
  - Department of Radiation Oncology, University of California, San Francisco, San Francisco, California
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, California
  - School of Medicine, University of California, San Francisco, San Francisco, California
- Ji Hyun Chang
  - Department of Radiation Oncology, University of California, San Francisco, San Francisco, California
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, California
  - Department of Radiation Oncology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Korea
- Julian C. Hong
  - Department of Radiation Oncology, University of California, San Francisco, San Francisco, California
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, California
  - UCSF UC Berkeley Joint Program in Computational Precision Health (CPH), San Francisco, CA

6
Berger-Tal O, Wong BBM, Adams CA, Blumstein DT, Candolin U, Gibson MJ, Greggor AL, Lagisz M, Macura B, Price CJ, Putman BJ, Snijders L, Nakagawa S. Leveraging AI to improve evidence synthesis in conservation. Trends Ecol Evol 2024; 39:548-557. [PMID: 38796352 DOI: 10.1016/j.tree.2024.04.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2023] [Revised: 04/18/2024] [Accepted: 04/22/2024] [Indexed: 05/28/2024]
Abstract
Systematic evidence syntheses (systematic reviews and maps) summarize knowledge and are used to support decisions and policies in a variety of applied fields, from medicine and public health to biodiversity conservation. However, conducting these exercises in conservation is often expensive and slow, which can impede their use and hamper progress in addressing the current biodiversity crisis. With the explosive growth of large language models (LLMs) and other forms of artificial intelligence (AI), we discuss here the promise and perils associated with their use. We conclude that, when judiciously used, AI has the potential to speed up and hopefully improve the process of evidence synthesis, which can be particularly useful for underfunded applied fields, such as conservation science.
Affiliation(s)
- Oded Berger-Tal
  - Mitrani Department of Desert Ecology, Jacob Blaustein Institutes for Desert Research, Ben-Gurion University of the Negev, Midreshet Ben-Gurion 8499000, Israel.
- Bob B M Wong
  - School of Biological Sciences, Monash University, Melbourne, VIC 3800, Australia.
- Carrie Ann Adams
  - Department of Fish, Wildlife, and Conservation Biology, Colorado State University, 1474 Campus Delivery, Fort Collins, CO 80523-1474, USA
- Daniel T Blumstein
  - Department of Ecology & Evolutionary Biology, University of California, 621 Young Drive South, Los Angeles, CA 90095-1606, USA
- Ulrika Candolin
  - Organismal and Evolutionary Biology Research Programme, University of Helsinki, 00014 Helsinki, Finland
- Matthew J Gibson
  - Evolution & Ecology Research Centre, Centre for Ecosystem Science, and School of Biological, Earth and Environmental Sciences, University of New South Wales, Sydney, NSW 2052, Australia
- Alison L Greggor
  - Conservation Science and Wildlife Health, San Diego Zoo Wildlife Alliance, 15600 San Pasqual Valley Road, Escondido, CA 92027-7000, USA
- Malgorzata Lagisz
  - Evolution & Ecology Research Centre, Centre for Ecosystem Science, and School of Biological, Earth and Environmental Sciences, University of New South Wales, Sydney, NSW 2052, Australia
- Biljana Macura
  - Stockholm Environment Institute (HQ), Box 24218, Stockholm, 10451, Sweden
- Catherine J Price
  - School of Life and Environmental Sciences, University of Sydney, NSW 2006, Australia
- Breanna J Putman
  - Department of Biology, California State University, 5500 University Parkway, San Bernardino, CA 92407-2393, USA
- Lysanne Snijders
  - Behavioural Ecology Group, Wageningen University & Research, De Elst 1, 6708 WD, Wageningen, The Netherlands
- Shinichi Nakagawa
  - Evolution & Ecology Research Centre, Centre for Ecosystem Science, and School of Biological, Earth and Environmental Sciences, University of New South Wales, Sydney, NSW 2052, Australia.

7
Guo Q, Jiang G, Zhao Q, Long Y, Feng K, Gu X, Xu Y, Li Z, Huang J, Du L. Rapid review: A review of methods and recommendations based on current evidence. J Evid Based Med 2024; 17:434-453. [PMID: 38512942 DOI: 10.1111/jebm.12594] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Accepted: 02/28/2024] [Indexed: 03/23/2024]
Abstract
Rapid review (RR) can accelerate the traditional systematic review (SR) process by simplifying or omitting steps using various shortcuts. With the increasing popularity of RR, numerous shortcuts have emerged, but there is no consensus on how to choose the most appropriate ones. This study conducted a literature search in PubMed from inception to December 21, 2023, using terms such as "rapid review", "rapid assessment", "rapid systematic review", and "rapid evaluation". We also scanned the reference lists and performed citation tracking of included impact studies to identify additional studies. We conducted a narrative synthesis of all RR approaches, shortcuts, and studies assessing their effectiveness at each stage of RRs. Based on the current evidence, we provided recommendations on utilizing certain shortcuts in RRs. Ultimately, we identified 185 studies focusing on summarizing RR approaches and shortcuts, or evaluating their impact. There was relatively sufficient evidence to support the use of the following shortcuts in RRs: limiting studies to those published in English; conducting abbreviated database searches (e.g., only searching PubMed/MEDLINE, Embase, and CENTRAL); omitting retrieval of grey literature; restricting the search timeframe to the most recent 20 years for medical interventions and the most recent 15 years for reviews of diagnostic test accuracy; and conducting a single screening by an experienced screener. To some extent, the above shortcuts were also applicable to SRs. This study provided a reference for future RR researchers in selecting shortcuts, and it also presented a potential research topic for methodologists.
Affiliation(s)
- Qiong Guo
  - Innovation Institute for Integration of Medicine and Engineering, West China Hospital, Sichuan University, Chengdu, P. R. China
  - West China Medical Publishers, West China Hospital, Sichuan University, Chengdu, P. R. China
- Guiyu Jiang
  - West China School of Public Health, Sichuan University, Chengdu, P. R. China
- Qingwen Zhao
  - West China School of Public Health, Sichuan University, Chengdu, P. R. China
- Youlin Long
  - Innovation Institute for Integration of Medicine and Engineering, West China Hospital, Sichuan University, Chengdu, P. R. China
  - Chinese Evidence-Based Medicine Center, West China Hospital, Sichuan University, Chengdu, P. R. China
- Kun Feng
  - Innovation Institute for Integration of Medicine and Engineering, West China Hospital, Sichuan University, Chengdu, P. R. China
  - Chinese Evidence-Based Medicine Center, West China Hospital, Sichuan University, Chengdu, P. R. China
- Xianlin Gu
  - Innovation Institute for Integration of Medicine and Engineering, West China Hospital, Sichuan University, Chengdu, P. R. China
  - Chinese Evidence-Based Medicine Center, West China Hospital, Sichuan University, Chengdu, P. R. China
- Yihan Xu
  - Innovation Institute for Integration of Medicine and Engineering, West China Hospital, Sichuan University, Chengdu, P. R. China
  - Chinese Evidence-Based Medicine Center, West China Hospital, Sichuan University, Chengdu, P. R. China
  - Center for education of medical humanities, West China Hospital, Sichuan University, Chengdu, P. R. China
- Zhengchi Li
  - Center for education of medical humanities, West China Hospital, Sichuan University, Chengdu, P. R. China
- Jin Huang
  - Innovation Institute for Integration of Medicine and Engineering, West China Hospital, Sichuan University, Chengdu, P. R. China
- Liang Du
  - Innovation Institute for Integration of Medicine and Engineering, West China Hospital, Sichuan University, Chengdu, P. R. China
  - West China Medical Publishers, West China Hospital, Sichuan University, Chengdu, P. R. China
  - Chinese Evidence-Based Medicine Center, West China Hospital, Sichuan University, Chengdu, P. R. China

8
Jacaruso L. Insights into the nutritional prevention of macular degeneration based on a comparative topic modeling approach. PeerJ Comput Sci 2024; 10:e1940. [PMID: 38660183 PMCID: PMC11042009 DOI: 10.7717/peerj-cs.1940] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Accepted: 02/22/2024] [Indexed: 04/26/2024]
Abstract
Topic modeling and text mining are subsets of natural language processing (NLP) with relevance for conducting meta-analysis (MA) and systematic review (SR). For evidence synthesis, the above NLP methods are conventionally used for topic-specific literature searches or extracting values from reports to automate essential phases of SR and MA. Instead, this work proposes a comparative topic modeling approach to analyze reports of contradictory results on the same general research question. Specifically, the objective is to identify topics exhibiting distinct associations with significant results for an outcome of interest by ranking them according to their proportional occurrence in (and consistency of distribution across) reports of significant effects. Macular degeneration (MD) is a disease that affects millions of people annually, causing vision loss. Augmenting evidence synthesis to provide insight into MD prevention is therefore of central interest in this article. The proposed method was tested on broad-scope studies addressing whether supplemental nutritional compounds significantly benefit macular degeneration. Six compounds were identified as having a particular association with reports of significant results for benefiting MD. Four of these were further supported in terms of effectiveness upon conducting a follow-up literature search for validation (omega-3 fatty acids, copper, zeaxanthin, and nitrates). The two not supported by the follow-up literature search (niacin and molybdenum) also had scores in the lowest range under the proposed scoring system. Results therefore suggest that the proposed method's score for a given topic may be a viable proxy for its degree of association with the outcome of interest, and can be helpful in the systematic search for potentially causal relationships. 
Further, the compounds identified by the proposed method were not simultaneously captured as salient topics by state-of-the-art topic models that leverage document and word embeddings (Top2Vec) and transformer models (BERTopic). These results underpin the proposed method's potential to add specificity in understanding effects from broad-scope reports, elucidate topics of interest for future research, and guide evidence synthesis in a scalable way. All of this is accomplished while yielding valuable and actionable insights into the prevention of MD.
Affiliation(s)
- Lucas Jacaruso
  - University of Southern California, Los Angeles, CA, United States of America

9
Panayi A, Ward K, Benhadji-Schaff A, Ibanez-Lopez AS, Xia A, Barzilay R. Evaluation of a prototype machine learning tool to semi-automate data extraction for systematic literature reviews. Syst Rev 2023; 12:187. [PMID: 37803451 PMCID: PMC10557215 DOI: 10.1186/s13643-023-02351-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2023] [Accepted: 09/13/2023] [Indexed: 10/08/2023] Open
Abstract
BACKGROUND Evidence-based medicine requires synthesis of research through rigorous and time-intensive systematic literature reviews (SLRs), with significant resource expenditure for data extraction from scientific publications. Machine learning may enable the timely completion of SLRs and reduce errors by automating data identification and extraction. METHODS We evaluated the use of machine learning to extract data from publications related to SLRs in oncology (SLR 1) and Fabry disease (SLR 2). SLR 1 predominantly contained interventional studies and SLR 2 observational studies. Predefined key terms and data were manually annotated to train and test bidirectional encoder representations from transformers (BERT) and bidirectional long-short-term memory machine learning models. Using human annotation as a reference, we assessed the ability of the models to identify biomedical terms of interest (entities) and their relations. We also pretrained BERT on a corpus of 100,000 open access clinical publications and/or enhanced context-dependent entity classification with a conditional random field (CRF) model. Performance was measured using the F1 score, a metric that combines precision and recall. We defined successful matches as partial overlap of entities of the same type. RESULTS For entity recognition, the pretrained BERT+CRF model had the best performance, with an F1 score of 73% in SLR 1 and 70% in SLR 2. Entity types identified with the highest accuracy were metrics for progression-free survival (SLR 1, F1 score 88%) or for patient age (SLR 2, F1 score 82%). Treatment arm dosage was identified less successfully (F1 scores 60% [SLR 1] and 49% [SLR 2]). The best-performing model for relation extraction, pretrained BERT relation classification, exhibited F1 scores higher than 90% in cases with at least 80 relation examples for a pair of related entity types. 
CONCLUSIONS The performance of BERT is enhanced by pretraining with biomedical literature and by combining with a CRF model. With refinement, machine learning may assist with manual data extraction for SLRs.
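The matching rule stated in this abstract (a predicted entity counts as correct when it partially overlaps a reference entity of the same type) can be illustrated with a short sketch. The abstract does not say how many-to-one matches or span boundaries were handled, so the span representation, the function names, and the counting of matches below are assumptions made for illustration, not the authors' scoring code.

```python
from typing import List, Tuple

# (start offset, end offset exclusive, entity type); this representation is an assumption.
Entity = Tuple[int, int, str]


def overlaps(a: Entity, b: Entity) -> bool:
    """True when both spans have the same type and share at least one character."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def partial_match_scores(predicted: List[Entity], gold: List[Entity]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 under partial-overlap matching."""
    matched_pred = sum(any(overlaps(p, g) for g in gold) for p in predicted)
    matched_gold = sum(any(overlaps(g, p) for p in predicted) for g in gold)
    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical annotations on one sentence.
gold = [(0, 10, "age"), (20, 30, "dosage"), (40, 50, "pfs")]
predicted = [(2, 8, "age"), (25, 35, "age"), (41, 50, "pfs")]

precision, recall, f1 = partial_match_scores(predicted, gold)
```

In this example the second prediction overlaps the dosage span but carries the wrong type, so it counts against precision and the dosage entity counts against recall.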
Affiliation(s)
- Antonia Panayi
  - Takeda Pharmaceuticals International AG, Thurgauerstrasse 130, 8152, Glattpark-Opfikon, Zurich, Switzerland.
- Andrew Xia
  - Takeda Pharmaceuticals International AG, Thurgauerstrasse 130, 8152, Glattpark-Opfikon, Zurich, Switzerland

10
Whitton J, Hunter A. Automated tabulation of clinical trial results: A joint entity and relation extraction approach with transformer-based language representations. Artif Intell Med 2023; 144:102661. [PMID: 37783549 DOI: 10.1016/j.artmed.2023.102661] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/01/2022] [Revised: 07/05/2023] [Accepted: 09/04/2023] [Indexed: 10/04/2023]
Abstract
Evidence-based medicine, the practice in which healthcare professionals refer to the best available evidence when making decisions, forms the foundation of modern healthcare. However, it relies on labour-intensive systematic reviews, where domain specialists must aggregate and extract information from thousands of publications, primarily of randomised controlled trial (RCT) results, into evidence tables. This paper investigates automating evidence table generation by decomposing the problem across two language processing tasks: named entity recognition, which identifies key entities within text, such as drug names, and relation extraction, which maps their relationships for separating them into ordered tuples. We focus on the automatic tabulation of sentences from published RCT abstracts that report the results of the study outcomes. Two deep neural net models were developed as part of a joint extraction pipeline, using the principles of transfer learning and transformer-based language representations. To train and test these models, a new gold-standard corpus was developed, comprising over 550 result sentences from six disease areas. This approach demonstrated significant advantages, with our system performing well across multiple natural language processing tasks and disease areas, as well as in generalising to disease domains unseen during training. Furthermore, we show these results were achievable through training our models on as few as 170 example sentences. The final system is a proof of concept that the generation of evidence tables can be semi-automated, representing a step towards fully automating systematic reviews.
Affiliation(s)
- Jetsun Whitton
  - Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, UK.
- Anthony Hunter
  - Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, UK.

11
Surgical procedure long terms recognition from Chinese literature incorporating structural feature. Heliyon 2022; 8:e11291. [PMID: 36387477 PMCID: PMC9640963 DOI: 10.1016/j.heliyon.2022.e11291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2022] [Revised: 10/04/2022] [Accepted: 10/24/2022] [Indexed: 11/06/2022] Open
Abstract
With the rapid development of technologies in medical diagnosis and treatment, novel and complicated concepts and usages of clinical terms, especially for surgical procedures, have become common in daily routine. Expected to be performed in an operating room and accompanied by an incision based on expert discretion, surgical procedures imply clinical understanding of diagnosis, examination, testing, equipment, drugs, symptoms, etc., but terms expressing surgical procedures are difficult to recognize because they are highly distinctive, with long morphological length and complex linguistic phenomena. To achieve higher recognition performance and overcome the challenge of the absence of natural delimiters in Chinese sentences, we propose a Named Entity Recognition (NER) model named Structural-SoftLexicon-Bi-LSTM-CRF (SSBC), empowered by the pre-trained model BERT. In particular, we pre-trained a lexicon embedding over a large-scale medical corpus to better leverage domain-specific structural knowledge. With input additionally augmented by BERT, rich multigranular information and structural term information are transferred from Structural-SoftLexicon to the downstream model Bi-LSTM-CRF. Therefore, we can obtain a globally optimal prediction of the input sequence. We evaluate our model on a self-built corpus, and results show that SSBC with the pre-trained model outperforms other state-of-the-art benchmarks, surpassing them by up to 3.77% in F1 score. This study would hopefully benefit the Diagnostic Related Groups (DRGs) and Diagnosis Intervention Package (DIP) grouping systems, medical records statistics and analysis, the Medicare payment system, etc.