1
Rao A, Mu A, Enichen E, Gupta D, Hall N, Koranteng E, Marks W, Senter-Zapata MJ, Whitehead DC, White BA, Saini S, Landman AB, Succi MD. A Future of Self-Directed Patient Internet Research: Large Language Model-Based Tools Versus Standard Search Engines. Ann Biomed Eng 2025; 53:1199-1208. [PMID: 40025252 DOI: 10.1007/s10439-025-03701-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2024] [Accepted: 02/22/2025] [Indexed: 03/04/2025]
Abstract
PURPOSE: As generalist large language models (LLMs) become more commonplace, patients will increasingly turn to these tools instead of traditional search engines. Here, we evaluate publicly available LLM-based chatbots as tools for patient education through physician review of responses provided by Google, Bard, GPT-3.5, and GPT-4 to commonly searched queries about prevalent chronic health conditions in the United States.
METHODS: Five distinct, commonly Google-searched queries were selected for (i) hypertension, (ii) hyperlipidemia, (iii) diabetes, (iv) anxiety, and (v) mood disorders and entered as prompts into each model of interest. Responses were assessed by board-certified physicians for accuracy, comprehensiveness, and overall quality on a five-point Likert scale. Flesch-Kincaid Grade Levels were calculated to assess readability.
RESULTS: GPT-3.5 (4.40 ± 0.48, 4.29 ± 0.43) and GPT-4 (4.35 ± 0.30, 4.24 ± 0.28) received higher ratings in comprehensiveness and quality, respectively, than Bard (3.79 ± 0.36, 3.87 ± 0.32) and Google (1.87 ± 0.42, 2.11 ± 0.47), all p < 0.05. However, Bard (9.45 ± 1.35) and Google (9.92 ± 5.31) responses had a lower average Flesch-Kincaid Grade Level than GPT-3.5 (14.69 ± 1.57) and GPT-4 (12.88 ± 2.02) responses, indicating greater readability.
CONCLUSION: This study suggests that publicly available LLM-based tools may provide patients with more accurate responses to queries on chronic health conditions than answers provided by Google search. These results support the use of these tools in place of traditional search engines for health-related queries.
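For readers unfamiliar with the readability metric cited in the abstract above, the following is a minimal sketch of the standard Flesch-Kincaid Grade Level formula, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. It is not the authors' implementation, which the abstract does not describe. The function names `count_syllables` and `flesch_kincaid_grade` are hypothetical, and the syllable counter is a rough vowel-group heuristic, so scores will differ somewhat from those of dedicated readability tools.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count vowel groups, discount a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent unless the word ends in 'le' or 'ee'.
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level; higher values indicate harder-to-read text."""
    # Naive sentence split on terminal punctuation; decimals such as "4.40" would be miscounted.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    plain = "The cat sat on the mat."
    dense = "Hypertension management necessitates individualized pharmacological intervention."
    print(round(flesch_kincaid_grade(plain), 2))
    print(round(flesch_kincaid_grade(dense), 2))
```

Short sentences made of short words score low, while long, polysyllabic sentences score high, which is the sense in which a lower grade level indicates greater readability.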
Affiliation(s)
- Arya Rao
  - Harvard Medical School, Boston, MA, USA
  - Medically Engineered Solutions in Healthcare Incubator, Innovation in Operations Research Center (MESH IO), Massachusetts General Hospital, Boston, MA, USA
  - Department of Radiology, Massachusetts General Hospital, 55 Fruit Street, Boston, MA, 02114, USA
- Andrew Mu
  - Harvard Medical School, Boston, MA, USA
  - Medically Engineered Solutions in Healthcare Incubator, Innovation in Operations Research Center (MESH IO), Massachusetts General Hospital, Boston, MA, USA
  - Department of Radiology, Massachusetts General Hospital, 55 Fruit Street, Boston, MA, 02114, USA
- Elizabeth Enichen
  - Harvard Medical School, Boston, MA, USA
  - Medically Engineered Solutions in Healthcare Incubator, Innovation in Operations Research Center (MESH IO), Massachusetts General Hospital, Boston, MA, USA
  - Department of Radiology, Massachusetts General Hospital, 55 Fruit Street, Boston, MA, 02114, USA
- Dhruva Gupta
  - Harvard Medical School, Boston, MA, USA
  - Medically Engineered Solutions in Healthcare Incubator, Innovation in Operations Research Center (MESH IO), Massachusetts General Hospital, Boston, MA, USA
  - Department of Radiology, Massachusetts General Hospital, 55 Fruit Street, Boston, MA, 02114, USA
- Nathan Hall
  - Medically Engineered Solutions in Healthcare Incubator, Innovation in Operations Research Center (MESH IO), Massachusetts General Hospital, Boston, MA, USA
- Erica Koranteng
  - Harvard Medical School, Boston, MA, USA
  - Medically Engineered Solutions in Healthcare Incubator, Innovation in Operations Research Center (MESH IO), Massachusetts General Hospital, Boston, MA, USA
  - Department of Radiology, Massachusetts General Hospital, 55 Fruit Street, Boston, MA, 02114, USA
- William Marks
  - Medically Engineered Solutions in Healthcare Incubator, Innovation in Operations Research Center (MESH IO), Massachusetts General Hospital, Boston, MA, USA
  - Harvard Business School, Boston, MA, USA
- Michael J Senter-Zapata
  - Harvard Medical School, Boston, MA, USA
  - Medically Engineered Solutions in Healthcare Incubator, Innovation in Operations Research Center (MESH IO), Massachusetts General Hospital, Boston, MA, USA
  - Department of Radiology, Massachusetts General Hospital, 55 Fruit Street, Boston, MA, 02114, USA
  - Mass General Brigham, Boston, MA, USA
- David C Whitehead
  - Harvard Medical School, Boston, MA, USA
  - Medically Engineered Solutions in Healthcare Incubator, Innovation in Operations Research Center (MESH IO), Massachusetts General Hospital, Boston, MA, USA
  - Department of Radiology, Massachusetts General Hospital, 55 Fruit Street, Boston, MA, 02114, USA
- Benjamin A White
  - Harvard Medical School, Boston, MA, USA
  - Medically Engineered Solutions in Healthcare Incubator, Innovation in Operations Research Center (MESH IO), Massachusetts General Hospital, Boston, MA, USA
  - Department of Radiology, Massachusetts General Hospital, 55 Fruit Street, Boston, MA, 02114, USA
- Sanjay Saini
  - Harvard Medical School, Boston, MA, USA
  - Medically Engineered Solutions in Healthcare Incubator, Innovation in Operations Research Center (MESH IO), Massachusetts General Hospital, Boston, MA, USA
  - Department of Radiology, Massachusetts General Hospital, 55 Fruit Street, Boston, MA, 02114, USA
- Adam B Landman
  - Harvard Medical School, Boston, MA, USA
  - Mass General Brigham, Boston, MA, USA
- Marc D Succi
  - Harvard Medical School, Boston, MA, USA
  - Medically Engineered Solutions in Healthcare Incubator, Innovation in Operations Research Center (MESH IO), Massachusetts General Hospital, Boston, MA, USA
  - Department of Radiology, Massachusetts General Hospital, 55 Fruit Street, Boston, MA, 02114, USA
  - Mass General Brigham, Boston, MA, USA
2
Dennstädt F, Hastings J, Putora PM, Schmerder M, Cihoric N. Implementing large language models in healthcare while balancing control, collaboration, costs and security. NPJ Digit Med 2025; 8:143. [PMID: 40050366 PMCID: PMC11885444 DOI: 10.1038/s41746-025-01476-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2024] [Accepted: 01/22/2025] [Indexed: 03/09/2025] [Open Access]
Affiliation(s)
- Fabio Dennstädt
  - Department of Radiation Oncology, Inselspital, Bern University Hospital and University of Bern, Bern, Switzerland
- Janna Hastings
  - School of Medicine, University of St. Gallen, St. Gallen, Switzerland
  - Institute for Implementation Science in Health Care, University of Zurich, Zurich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Paul Martin Putora
  - Department of Radiation Oncology, Inselspital, Bern University Hospital and University of Bern, Bern, Switzerland
  - Department of Radiation Oncology, Kantonsspital St. Gallen, St. Gallen, Switzerland
- Max Schmerder
  - Department of Radiation Oncology, Inselspital, Bern University Hospital and University of Bern, Bern, Switzerland
- Nikola Cihoric
  - Department of Radiation Oncology, Inselspital, Bern University Hospital and University of Bern, Bern, Switzerland
4
Young CC, Enichen E, Rivera C, Auger CA, Grant N, Rao A, Succi MD. Diagnostic Accuracy of a Custom Large Language Model on Rare Pediatric Disease Case Reports. Am J Med Genet A 2025; 197:e63878. [PMID: 39268988 DOI: 10.1002/ajmg.a.63878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2024] [Revised: 08/10/2024] [Accepted: 08/29/2024] [Indexed: 09/15/2024]
Abstract
Accurately diagnosing rare pediatric diseases frequently represents a clinical challenge due to their complex and unusual clinical presentations. Here, we explore the capabilities of three large language models (LLMs), GPT-4, Gemini Pro, and a custom-built LLM (GPT-4 integrated with the Human Phenotype Ontology [GPT-4 HPO]), by evaluating their diagnostic performance on 61 rare pediatric disease case reports. The performance of the LLMs was assessed for accuracy in identifying specific diagnoses, listing the correct diagnosis within a differential list, and identifying broad disease categories. In addition, GPT-4 HPO was tested on 100 general pediatrics case reports previously used to assess other LLMs to further validate its performance. The results indicated that GPT-4 predicted the correct diagnosis with a diagnostic accuracy of 13.1%, whereas both GPT-4 HPO and Gemini Pro had diagnostic accuracies of 8.2%. Further, GPT-4 HPO showed improved performance compared with the other two LLMs in identifying the correct diagnosis within its differential list and the broad disease category. Although these findings underscore the potential of LLMs for diagnostic support, particularly when enhanced with domain-specific ontologies, they also stress the need for further improvement prior to integration into clinical practice.
Affiliation(s)
- Cameron C Young
  - Harvard Medical School, Boston, Massachusetts, USA
  - Medically Engineered Solutions in Healthcare Incubator, Innovation in Operations Research Center, Mass General Brigham, Boston, Massachusetts, USA
- Ellie Enichen
  - Harvard Medical School, Boston, Massachusetts, USA
  - Medically Engineered Solutions in Healthcare Incubator, Innovation in Operations Research Center, Mass General Brigham, Boston, Massachusetts, USA
- Christian Rivera
  - Harvard Medical School, Boston, Massachusetts, USA
  - Medically Engineered Solutions in Healthcare Incubator, Innovation in Operations Research Center, Mass General Brigham, Boston, Massachusetts, USA
- Corinne A Auger
  - Harvard Medical School, Boston, Massachusetts, USA
  - Medically Engineered Solutions in Healthcare Incubator, Innovation in Operations Research Center, Mass General Brigham, Boston, Massachusetts, USA
- Nathan Grant
  - Harvard Medical School, Boston, Massachusetts, USA
  - Medically Engineered Solutions in Healthcare Incubator, Innovation in Operations Research Center, Mass General Brigham, Boston, Massachusetts, USA
- Arya Rao
  - Harvard Medical School, Boston, Massachusetts, USA
  - Medically Engineered Solutions in Healthcare Incubator, Innovation in Operations Research Center, Mass General Brigham, Boston, Massachusetts, USA
- Marc D Succi
  - Medically Engineered Solutions in Healthcare Incubator, Innovation in Operations Research Center, Mass General Brigham, Boston, Massachusetts, USA
  - Department of Radiology, Massachusetts General Hospital, Boston, Massachusetts, USA