1
Thakkar V, Silverman GM, Kc A, Ingraham NE, Jones EK, King S, Melton GB, Zhang R, Tignanelli CJ. A comparative analysis of large language models versus traditional information extraction methods for real-world evidence of patient symptomatology in acute and post-acute sequelae of SARS-CoV-2. PLoS One 2025; 20:e0323535. [PMID: 40373001 PMCID: PMC12080813 DOI: 10.1371/journal.pone.0323535] [Received: 11/20/2024] [Accepted: 04/10/2025] [Indexed: 05/17/2025]
Abstract
BACKGROUND Patient symptoms, crucial for disease progression and diagnosis, are often captured in unstructured clinical notes. Large language models (LLMs) offer potential advantages in extracting patient symptoms compared to traditional rule-based information extraction (IE) systems.

METHODS This study compared fine-tuned LLMs (LLaMA2-13B and LLaMA3-8B) against BioMedICUS, a rule-based IE system, for extracting symptoms related to acute and post-acute sequelae of SARS-CoV-2 from clinical notes. The study utilized three corpora: UMN-COVID, UMN-PASC, and N3C-COVID. Prevalence, keyword, and fairness analyses were conducted to assess symptom distribution and model equity across demographics.

RESULTS BioMedICUS outperformed fine-tuned LLMs in most cases. On the UMN-PASC dataset, BioMedICUS achieved a macro-averaged F1-score of 0.70 for positive mention detection, compared to 0.66 for LLaMA2-13B and 0.62 for LLaMA3-8B. On the N3C-COVID dataset, BioMedICUS scored 0.75 for positive mention detection, while LLaMA2-13B and LLaMA3-8B scored 0.53 and 0.68, respectively. However, LLMs performed better in specific instances, such as detecting positive mentions of change in sleep in the UMN-PASC dataset, where LLaMA2-13B (0.79) and LLaMA3-8B (0.65) outperformed BioMedICUS (0.60). In the fairness analysis, BioMedICUS generally showed stronger performance across patient demographics. Keyword analysis using ANOVA on symptom distributions across all three corpora showed that both corpus (df = 2, p < 0.001) and symptom (df = 79, p < 0.001) had a statistically significant effect on log-transformed term frequency-inverse document frequency (TF-IDF) values; corpus accounted for 52% of the variance in log_tfidf values and symptom for 35%.

CONCLUSION While BioMedICUS generally outperformed the LLMs, the LLMs showed promising results in specific areas, particularly LLaMA3-8B in identifying negative symptom mentions. However, both LLaMA models faced challenges in demographic fairness and generalizability. These findings underscore the need for diverse, high-quality training datasets and robust annotation processes to enhance LLMs' performance and reliability in clinical applications.
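For readers unfamiliar with the macro-averaged F1-score reported in the abstract, the following is a minimal sketch of how such a score is typically computed: an F1 value is calculated per class (here, per symptom) and the values are averaged without weighting. The symptom names and counts below are hypothetical illustrations, not data from the study, and the study's own evaluation pipeline may differ.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 for one class from true-positive, false-positive and false-negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(counts: dict) -> float:
    """Unweighted mean of the per-class F1 scores (macro average)."""
    scores = [f1_score(tp, fp, fn) for tp, fp, fn in counts.values()]
    return sum(scores) / len(scores)


# Hypothetical (tp, fp, fn) counts per symptom, for illustration only.
example_counts = {
    "cough": (8, 2, 2),
    "fatigue": (3, 1, 3),
    "change in sleep": (1, 0, 1),
}

print(round(macro_f1(example_counts), 3))
```

Because every symptom contributes equally to the mean, a rare symptom with a poor F1 lowers the macro average as much as a common one would.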
Affiliation(s)
- Vedansh Thakkar
  - Department of Surgery, University of Minnesota, Minneapolis, Minnesota, United States of America
  - Natural Language Processing/Information Extraction Program, University of Minnesota, Minneapolis, Minnesota, United States of America
- Greg M. Silverman
  - Department of Surgery, University of Minnesota, Minneapolis, Minnesota, United States of America
  - Natural Language Processing/Information Extraction Program, University of Minnesota, Minneapolis, Minnesota, United States of America
- Abhinab Kc
  - University of Minnesota Medical School, Minneapolis, Minnesota, United States of America
- Nicholas E. Ingraham
  - Department of Pulmonary, Allergy, Critical Care, and Sleep Medicine, University of Minnesota, Minneapolis, Minnesota, United States of America
  - Center for Learning Health Systems Sciences, University of Minnesota, Minneapolis, Minnesota, United States of America
- Emma K. Jones
  - Department of Surgery, University of Minnesota, Minneapolis, Minnesota, United States of America
- Samantha King
  - Department of Surgery, University of Washington, Seattle, Washington, United States of America
- Genevieve B. Melton
  - Department of Surgery, University of Minnesota, Minneapolis, Minnesota, United States of America
  - Natural Language Processing/Information Extraction Program, University of Minnesota, Minneapolis, Minnesota, United States of America
  - Center for Learning Health Systems Sciences, University of Minnesota, Minneapolis, Minnesota, United States of America
- Rui Zhang
  - Department of Surgery, University of Minnesota, Minneapolis, Minnesota, United States of America
  - Natural Language Processing/Information Extraction Program, University of Minnesota, Minneapolis, Minnesota, United States of America
  - Center for Learning Health Systems Sciences, University of Minnesota, Minneapolis, Minnesota, United States of America
- Christopher J. Tignanelli
  - Department of Surgery, University of Minnesota, Minneapolis, Minnesota, United States of America
  - Natural Language Processing/Information Extraction Program, University of Minnesota, Minneapolis, Minnesota, United States of America
  - Center for Learning Health Systems Sciences, University of Minnesota, Minneapolis, Minnesota, United States of America
3
Sun J, Peng L, Li T, Adila D, Zaiman Z, Melton-Meaux GB, Ingraham NE, Murray E, Boley D, Switzer S, Burns JL, Huang K, Allen T, Steenburg SD, Gichoya JW, Kummerfeld E, Tignanelli CJ. Performance of a Chest Radiograph AI Diagnostic Tool for COVID-19: A Prospective Observational Study. Radiol Artif Intell 2022; 4:e210217. [PMID: 35923381 PMCID: PMC9344211 DOI: 10.1148/ryai.210217] [Received: 08/09/2021] [Revised: 03/31/2022] [Accepted: 05/11/2022] [Indexed: 05/27/2023]
Abstract
Purpose To conduct a prospective observational study across 12 U.S. hospitals to evaluate real-time performance of an interpretable artificial intelligence (AI) model to detect COVID-19 on chest radiographs.

Materials and Methods A total of 95 363 chest radiographs were included in model training, external validation, and real-time validation. The model was deployed as a clinical decision support system, and performance was prospectively evaluated. There were 5335 total real-time predictions and a COVID-19 prevalence of 4.8% (258 of 5335). Model performance was assessed with use of receiver operating characteristic analysis, precision-recall curves, and F1 score. Logistic regression was used to evaluate the association of race and sex with AI model diagnostic accuracy. To compare model accuracy with the performance of board-certified radiologists, a third dataset of 1638 images was read independently by two radiologists.

Results Participants positive for COVID-19 had higher COVID-19 diagnostic scores than participants negative for COVID-19 (median, 0.1 [IQR, 0.0-0.8] vs 0.0 [IQR, 0.0-0.1], respectively; P < .001). Real-time model performance was unchanged over 19 weeks of implementation (area under the receiver operating characteristic curve, 0.70; 95% CI: 0.66, 0.73). Model sensitivity was higher in men than women (P = .01), whereas model specificity was higher in women (P = .001). Sensitivity was higher for Asian (P = .002) and Black (P = .046) participants compared with White participants. The COVID-19 AI diagnostic system had worse accuracy (63.5% correct) compared with radiologist predictions (radiologist 1 = 67.8% correct, radiologist 2 = 68.6% correct; McNemar P < .001 for both).

Conclusion AI-based tools have not yet reached full diagnostic potential for COVID-19 and underperform compared with radiologist prediction.

Keywords: Diagnosis, Classification, Application Domain, Infection, Lung

Supplemental material is available for this article. © RSNA, 2022.
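The McNemar comparison mentioned in the Results tests whether two readers of the same images (for example, the AI model and a radiologist) differ in accuracy, using only the cases where exactly one of them was correct. The sketch below shows the exact (binomial) form of the test; the abstract does not state which variant the authors used, so this is an assumption, and the discordant counts in the usage line are hypothetical rather than taken from the study.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: cases reader A got right and reader B got wrong
    c: cases reader A got wrong and reader B got right
    Under the null hypothesis the discordant cases split 50/50,
    so min(b, c) follows a Binomial(b + c, 0.5) distribution.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant cases, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts, for illustration only.
print(mcnemar_exact(40, 70))
```

Cases that both readers classified correctly, or both incorrectly, do not enter the statistic, which is why the test suits paired reads of a single image set.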
Affiliation(s)
- Ju Sun
- Le Peng
- Taihui Li
- Dyah Adila
- Zach Zaiman
- Genevieve B. Melton-Meaux
- Nicholas E. Ingraham
- Eric Murray
- Daniel Boley
- Sean Switzer
- John L. Burns
- Kun Huang
- Tadashi Allen
- Scott D. Steenburg
- Judy Wawira Gichoya

From the Department of Computer Science and Engineering (J.S., L.P., T.L., D.A., D.B.), Institute for Health Informatics (G.B.M.M., E.K., C.J.T.), Department of Surgery (G.B.M.M., C.J.T.), Department of Medicine, Division of Pulmonary and Critical Care (N.E.I.), Department of Medicine (S.S.), and Department of Radiology (T.A.), University of Minnesota, 420 Delaware St SE, Minneapolis, MN 55455; Departments of Computer Science (Z.Z.) and Radiology (J.W.G.), Emory University, Atlanta, Ga; M Health Fairview Informatics, Minneapolis, Minn (E.M.); The School of Medicine (J.L.B., K.H.) and Department of Radiology (S.D.S.), Indiana University, Indianapolis, Ind; and Department of Surgery, North Memorial Health Hospital, Robbinsdale, Minn (C.J.T.)