1
Luo D, Liu M, Yu R, Liu Y, Jiang W, Fan Q, Kuang N, Gao Q, Yin T, Zheng Z. Evaluating the performance of GPT-3.5, GPT-4, and GPT-4o in the Chinese National Medical Licensing Examination. Sci Rep 2025; 15:14119. [PMID: 40269046 PMCID: PMC12018924 DOI: 10.1038/s41598-025-98949-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2025] [Accepted: 04/15/2025] [Indexed: 04/25/2025] Open
Abstract
This study aims to compare and evaluate the performance of GPT-3.5, GPT-4, and GPT-4o in the 2020 and 2021 Chinese National Medical Licensing Examination (NMLE), exploring their potential value in medical education and clinical applications. Six hundred original test questions from the 2020 and 2021 NMLE (covering five types of questions) were selected and input into GPT-3.5, GPT-4, and GPT-4o for response. The accuracy of the models across different question types and units was recorded and analyzed. Statistical methods were employed to compare the performance differences among the three models. GPT-4o demonstrated significantly higher overall accuracy than GPT-4 and GPT-3.5 (P < 0.001). In the 2020 and 2021 exams, GPT-4o achieved accuracy rates of 84.2% and 88.2%, respectively, with the highest accuracy observed in questions related to the digestive system (Unit 3), reaching 94.75%. GPT-4 showed moderate performance, while GPT-3.5 had the lowest accuracy. Additionally, GPT-4o exhibited a clear advantage in complex question formats, such as case analysis questions (A3/A4 type) and standard matching questions (B1 type). GPT-4o outperformed its predecessors in the NMLE, demonstrating exceptional comprehension and problem-solving abilities in non-English medical examinations. This study provides important insights into the application and promotion of generative AI in medical education and clinical practice.
Affiliation(s)
- Dingyuan Luo
  - Department of Rehabilitation Medicine Center, Affiliated Tai'an Central Hospital, Qingdao University, No. 29, Longtan Road, Taishan District, Tai'an City, 271000, Shandong, China
- Mengke Liu
  - Department of Radiology, Affiliated Shandong Provincial Hospital, Shandong First Medical University, Jinan, 250021, Shandong, China
- Runyuan Yu
  - Department of Rehabilitation Medicine Center, Affiliated Tai'an Central Hospital, Qingdao University, No. 29, Longtan Road, Taishan District, Tai'an City, 271000, Shandong, China
- Yulian Liu
  - Department of Rehabilitation Medicine Center, Affiliated Tai'an Central Hospital, Qingdao University, No. 29, Longtan Road, Taishan District, Tai'an City, 271000, Shandong, China
- Wenjun Jiang
  - Department of Rehabilitation Medicine Center, Affiliated Tai'an Central Hospital, Qingdao University, No. 29, Longtan Road, Taishan District, Tai'an City, 271000, Shandong, China
- Qi Fan
  - Department of Rehabilitation Medicine Center, Affiliated Tai'an Central Hospital, Qingdao University, No. 29, Longtan Road, Taishan District, Tai'an City, 271000, Shandong, China
- Naifeng Kuang
  - Department of Rehabilitation Medicine Center, Affiliated Tai'an Central Hospital, Qingdao University, No. 29, Longtan Road, Taishan District, Tai'an City, 271000, Shandong, China
- Qiang Gao
  - Department of Rehabilitation Medicine Center, Affiliated Tai'an Central Hospital, Qingdao University, No. 29, Longtan Road, Taishan District, Tai'an City, 271000, Shandong, China
- Tao Yin
  - Department of Rehabilitation Medicine Center, Affiliated Tai'an Central Hospital, Qingdao University, No. 29, Longtan Road, Taishan District, Tai'an City, 271000, Shandong, China
- Zuncheng Zheng
  - Department of Rehabilitation Medicine Center, Affiliated Tai'an Central Hospital, Qingdao University, No. 29, Longtan Road, Taishan District, Tai'an City, 271000, Shandong, China
2
Fathima M, Moulana M. Revolutionizing Breast Cancer Care: AI-Enhanced Diagnosis and Patient History. Comput Methods Biomech Biomed Engin 2025; 28:642-654. [PMID: 38178694 DOI: 10.1080/10255842.2023.2300681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Revised: 11/23/2023] [Accepted: 12/20/2023] [Indexed: 01/06/2024]
Abstract
Breast cancer poses a significant global health challenge, demanding enhanced diagnostic accuracy and streamlined medical history documentation. This study presents a holistic approach that harnesses the power of artificial intelligence (AI) and machine learning (ML) to address these pressing needs. It presents a comprehensive methodology for breast cancer diagnosis and medical history generation, integrating data collection, feature extraction, machine learning, and AI-driven history-taking. The research employs a systematic approach to ensure accurate diagnosis and efficient history collection. Data preprocessing merges similar attributes to streamline analysis. Three key algorithms, Support Vector Machine (SVM), K-Nearest Neighbours (KNN), and Fuzzy Logic, are applied. Fuzzy Logic shows exceptional accuracy in handling uncertain data. Deep learning models enhance predictive accuracy, emphasizing the synergy between traditional and deep learning approaches. The AI-driven history collection simplifies the patient history-taking process, adapting questions dynamically based on patient responses. Comprehensive medical history reports summarize patient data, facilitating informed healthcare decisions. The research prioritizes ethical compliance and data privacy. OpenAI's GPT-3.5 is integrated to generate automated patient reports, offering structured overviews of patient health history. The study's results indicate the potential for enhanced disease prediction accuracy and streamlined medical history collection, contributing to more reliable healthcare assessments and patient care. Machine learning, deep learning, and AI-driven approaches hold promise for a wide range of applications in healthcare and beyond.
Affiliation(s)
- Maleeha Fathima
  - Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Guntur, Andhra Pradesh, India
- Mohammed Moulana
  - Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Guntur, Andhra Pradesh, India
3
Wu C, Chen L, Han M, Li Z, Yang N, Yu C. Application of ChatGPT-based blended medical teaching in clinical education of hepatobiliary surgery. Medical Teacher 2025; 47:445-449. [PMID: 38614458 DOI: 10.1080/0142159x.2024.2339412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Accepted: 04/02/2024] [Indexed: 04/15/2024]
Abstract
OBJECTIVE: This study evaluates the effectiveness of incorporating the Chat Generative Pre-trained Transformer (ChatGPT) into the clinical teaching of hepatobiliary surgery for undergraduate medical students.
MATERIALS AND METHODS: A group of 61 medical undergraduates from the Affiliated Hospital of Guizhou Medical University, undergoing hepatobiliary surgery training, were randomly assigned to either an experimental group (31 students) using ChatGPT-based blended teaching or a control group (30 students) with traditional teaching methods. The evaluation metrics included final exam scores, teaching satisfaction, and teaching effectiveness ratings, analyzed using SPSS 26.0 (SPSS Inc., Chicago, IL) with t-tests and χ² tests.
RESULTS: The experimental group significantly outperformed the control group in final exam theoretical scores (86.44 ± 5.59 vs. 77.86 ± 4.16, p < .001) and clinical skills scores (83.84 ± 6.13 vs. 79.12 ± 4.27, p = .001). Additionally, the experimental group reported higher teaching satisfaction (17.23 ± 1.33) and self-evaluation of teaching effectiveness (9.14 ± 0.54) compared to the control group (15.38 ± 1.5 and 8.46 ± 0.70, respectively, p < .001).
CONCLUSIONS: The integration of ChatGPT into hepatobiliary surgery education significantly enhances theoretical knowledge, clinical skills, and overall satisfaction among medical undergraduates, suggesting a beneficial impact on their educational development.
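As a rough check on the reported comparison, the t statistic for the theoretical exam scores can be recomputed from the summary statistics alone. The sketch below assumes a pooled-variance (Student's) two-sample t-test; the abstract only says "t-tests", so the exact variant is an assumption, and `pooled_t` is an illustrative name rather than anything from the study.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's two-sample t statistic from summary statistics.

    Assumes equal variances in both groups (an assumption, not stated in the abstract).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df

# Theoretical exam scores from the abstract: experimental (n = 31) vs. control (n = 30).
t_stat, df = pooled_t(86.44, 5.59, 31, 77.86, 4.16, 30)
print(f"t = {t_stat:.2f}, df = {df}")
```

A t statistic of this size on 59 degrees of freedom is in line with the reported p < .001.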
Affiliation(s)
- Changhao Wu
  - Department of Hepatobiliary Surgery, The Affiliated Hospital of Guizhou Medical University, Guizhou Medical University, Guiyang, China
  - Department of Surgery, Guizhou Medical University, Guiyang, China
  - College of Clinical Medicine, Guizhou Medical University, Guiyang, China
  - Guizhou Provincial Institute of Hepatobiliary, Pancreatic and Splenic Diseases, Guiyang, China
- Liwen Chen
  - Department of Hepatobiliary Surgery, The Affiliated Hospital of Guizhou Medical University, Guizhou Medical University, Guiyang, China
  - Department of Surgery, Guizhou Medical University, Guiyang, China
  - College of Clinical Medicine, Guizhou Medical University, Guiyang, China
  - Guizhou Provincial Institute of Hepatobiliary, Pancreatic and Splenic Diseases, Guiyang, China
- Min Han
  - Department of Hepatobiliary Surgery, The Affiliated Hospital of Guizhou Medical University, Guizhou Medical University, Guiyang, China
  - Department of Surgery, Guizhou Medical University, Guiyang, China
  - College of Clinical Medicine, Guizhou Medical University, Guiyang, China
  - Guizhou Provincial Institute of Hepatobiliary, Pancreatic and Splenic Diseases, Guiyang, China
- Zhu Li
  - Department of Hepatobiliary Surgery, The Affiliated Hospital of Guizhou Medical University, Guizhou Medical University, Guiyang, China
  - Department of Surgery, Guizhou Medical University, Guiyang, China
  - College of Clinical Medicine, Guizhou Medical University, Guiyang, China
  - Guizhou Provincial Institute of Hepatobiliary, Pancreatic and Splenic Diseases, Guiyang, China
- Nenghong Yang
  - Department of Hepatobiliary Surgery, The Affiliated Hospital of Guizhou Medical University, Guizhou Medical University, Guiyang, China
  - Department of Surgery, Guizhou Medical University, Guiyang, China
  - College of Clinical Medicine, Guizhou Medical University, Guiyang, China
  - Guizhou Provincial Institute of Hepatobiliary, Pancreatic and Splenic Diseases, Guiyang, China
- Chao Yu
  - Department of Hepatobiliary Surgery, The Affiliated Hospital of Guizhou Medical University, Guizhou Medical University, Guiyang, China
  - Department of Surgery, Guizhou Medical University, Guiyang, China
  - College of Clinical Medicine, Guizhou Medical University, Guiyang, China
  - Guizhou Provincial Institute of Hepatobiliary, Pancreatic and Splenic Diseases, Guiyang, China
4
Hong EK, Ham J, Roh B, Gu J, Park B, Kang S, You K, Eom J, Bae B, Jo JB, Song OK, Bae W, Lee RW, Suh CH, Park CH, Choi SJ, Park JS, Park JH, Jeon HJ, Hong JH, Cho D, Choi HS, Kim TH. Diagnostic Accuracy and Clinical Value of a Domain-specific Multimodal Generative AI Model for Chest Radiograph Report Generation. Radiology 2025; 314:e241476. [PMID: 40131111 DOI: 10.1148/radiol.241476] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/26/2025]
Abstract
Background: Generative artificial intelligence (AI) is anticipated to alter radiology workflows, requiring a clinical value assessment for frequent examinations like chest radiograph interpretation.
Purpose: To develop and evaluate the diagnostic accuracy and clinical value of a domain-specific multimodal generative AI model for providing preliminary interpretations of chest radiographs.
Materials and Methods: For training, consecutive radiograph-report pairs from frontal chest radiography were retrospectively collected from 42 hospitals (2005-2023). The trained domain-specific AI model generated radiology reports for the radiographs. The test set included public datasets (PadChest, Open-i, VinDr-CXR, and MIMIC-CXR-JPG) and radiographs excluded from training. The sensitivity and specificity of the model-generated reports for 13 radiographic findings, compared with radiologist annotations (reference standard), were calculated (with 95% CIs). Four radiologists evaluated the subjective quality of the reports in terms of acceptability, agreement score, quality score, and comparative ranking of reports from (a) the domain-specific AI model, (b) radiologists, and (c) a general-purpose large language model (GPT-4Vision). Acceptability was defined as whether the radiologist would endorse the report as their own without changes. Agreement scores from 1 (clinically significant discrepancy) to 5 (complete agreement) were assigned using RADPEER; quality scores were on a 5-point Likert scale from 1 (very poor) to 5 (excellent).
Results: A total of 8 838 719 radiograph-report pairs (training) and 2145 radiographs (testing) were included (anonymized with respect to sex and gender). Reports generated by the domain-specific AI model demonstrated high sensitivity for detecting two critical radiographic findings: 95.3% (181 of 190) for pneumothorax and 92.6% (138 of 149) for subcutaneous emphysema. Acceptance rate, evaluated by four radiologists, was 70.5% (6047 of 8580), 73.3% (6288 of 8580), and 29.6% (2536 of 8580) for model-generated, radiologist, and GPT-4Vision reports, respectively. Agreement scores were highest for the model-generated reports (median = 4 [IQR, 3-5]) and lowest for GPT-4Vision reports (median = 1 [IQR, 1-3]; P < .001). Quality scores were also highest for the model-generated reports (median = 4 [IQR, 3-5]) and lowest for the GPT-4Vision reports (median = 2 [IQR, 1-3]; P < .001). From the ranking analysis, model-generated reports were most frequently ranked the highest (60.0%; 5146 of 8580), and GPT-4Vision reports were most frequently ranked the lowest (73.6%; 6312 of 8580).
Conclusion: A domain-specific multimodal generative AI model demonstrated potential for high diagnostic accuracy and clinical value in providing preliminary interpretations of chest radiographs for radiologists.
© RSNA, 2025. Supplemental material is available for this article. See also the editorial by Little in this issue.
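The two sensitivity figures quoted above follow directly from the stated counts (findings detected out of findings present). The sketch below recomputes them and adds a 95% Wilson score interval as one plausible way to obtain the confidence intervals; the abstract does not say which interval method was used, so the Wilson choice and the helper names are assumptions.

```python
import math

def sensitivity(detected, present):
    """Fraction of reference-standard positives that the reports flagged."""
    return detected / present

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (interval method assumed, not from the study)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width

# (detected, present) counts taken from the abstract.
counts = {"pneumothorax": (181, 190), "subcutaneous emphysema": (138, 149)}

results = {}
for finding, (detected, present) in counts.items():
    low, high = wilson_interval(detected, present)
    results[finding] = (sensitivity(detected, present), low, high)
    print(f"{finding}: {100 * results[finding][0]:.1f}% "
          f"(95% CI {100 * low:.1f}-{100 * high:.1f})")
```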
Affiliation(s)
- Eun Kyoung Hong
  - Department of Radiology, Brigham & Women's Hospital, 75 Francis St, Boston, MA 02215
- Chan Ho Park
  - College of Medicine, Soonchunhyang University, Cheonan, South Korea
- Seong Jun Choi
  - College of Medicine, Soonchunhyang University, Cheonan, South Korea
- Jai Soung Park
  - College of Medicine, Soonchunhyang University, Cheonan, South Korea
- Jae-Hyeong Park
  - College of Medicine, Chungnam National University, Daejun, South Korea
- Hyun Jeong Jeon
  - College of Medicine, Chungbuk National University, Cheongju, South Korea
- Jeong-Ho Hong
  - School of Medicine, Keimyung University, Daegu, South Korea
- Dosang Cho
  - College of Medicine, Ewha Womans University, Seoul, South Korea
- Han Seok Choi
  - College of Medicine, Dongguk University, Goyang, South Korea
- Tae Hee Kim
  - School of Medicine, Ajou University, Suwon, South Korea
5
Pugliese N, Bertazzoni A, Hassan C, Schattenberg JM, Aghemo A. Revolutionizing MASLD: How Artificial Intelligence Is Shaping the Future of Liver Care. Cancers (Basel) 2025; 17:722. [PMID: 40075570 PMCID: PMC11899536 DOI: 10.3390/cancers17050722] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/05/2025] [Revised: 02/08/2025] [Accepted: 02/17/2025] [Indexed: 03/14/2025] Open
Abstract
Metabolic dysfunction-associated steatotic liver disease (MASLD) is emerging as a leading cause of chronic liver disease. In recent years, artificial intelligence (AI) has attracted significant attention in healthcare, particularly in diagnostics, patient management, and drug development, demonstrating immense potential for application and implementation. In the field of MASLD, substantial research has explored the application of AI in various areas, including patient counseling, improved patient stratification, enhanced diagnostic accuracy, drug development, and prognosis prediction. However, the integration of AI in hepatology is not without challenges. Key issues include data management and privacy, algorithmic bias, and the risk of AI-generated inaccuracies, commonly referred to as "hallucinations". This review aims to provide a comprehensive overview of the applications of AI in hepatology, with a focus on MASLD, highlighting both its transformative potential and its inherent limitations.
Affiliation(s)
- Nicola Pugliese
  - Department of Biomedical Sciences, Humanitas University, 20072 Pieve Emanuele, MI, Italy
  - Division of Internal Medicine and Hepatology, Department of Gastroenterology, IRCCS Humanitas Research Hospital, 20089 Rozzano, MI, Italy
- Arianna Bertazzoni
  - Department of Biomedical Sciences, Humanitas University, 20072 Pieve Emanuele, MI, Italy
  - Division of Internal Medicine and Hepatology, Department of Gastroenterology, IRCCS Humanitas Research Hospital, 20089 Rozzano, MI, Italy
- Cesare Hassan
  - Department of Biomedical Sciences, Humanitas University, 20072 Pieve Emanuele, MI, Italy
  - Endoscopy Unit, Department of Gastroenterology, IRCCS Humanitas Research Hospital, 20089 Rozzano, MI, Italy
- Jörn M. Schattenberg
  - Department of Internal Medicine II, Saarland University Medical Center, 66421 Homburg, Germany
- Alessio Aghemo
  - Department of Biomedical Sciences, Humanitas University, 20072 Pieve Emanuele, MI, Italy
  - Division of Internal Medicine and Hepatology, Department of Gastroenterology, IRCCS Humanitas Research Hospital, 20089 Rozzano, MI, Italy
6
Ma X, Huang T, Chen X, Li Q, Liao M, Fu L, Huang J, Yuan K, Wang Z, Zeng Y. Molecular mechanisms in liver repair and regeneration: from physiology to therapeutics. Signal Transduct Target Ther 2025; 10:63. [PMID: 39920130 PMCID: PMC11806117 DOI: 10.1038/s41392-024-02104-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 09/02/2024] [Accepted: 12/12/2024] [Indexed: 02/09/2025] Open
Abstract
Liver repair and regeneration are crucial physiological responses to hepatic injury and are orchestrated through intricate cellular and molecular networks. This review systematically delineates advancements in the field, emphasizing the essential roles played by diverse liver cell types. Their coordinated actions, supported by complex crosstalk within the liver microenvironment, are pivotal to enhancing regenerative outcomes. Recent molecular investigations have elucidated key signaling pathways involved in liver injury and regeneration. Viewed through the lens of metabolic reprogramming, these pathways highlight how shifts in glucose, lipid, and amino acid metabolism support the cellular functions essential for liver repair and regeneration. An analysis of regenerative variability across pathological states reveals how disease conditions influence these dynamics, guiding the development of novel therapeutic strategies and advanced techniques to enhance liver repair and regeneration. Bridging laboratory findings with practical applications, recent clinical trials highlight the potential of optimizing liver regeneration strategies. These trials offer valuable insights into the effectiveness of novel therapies and underscore significant progress in translational research. In conclusion, this review intricately links molecular insights to therapeutic frontiers, systematically charting the trajectory from fundamental physiological mechanisms to innovative clinical applications in liver repair and regeneration.
Affiliation(s)
- Xiao Ma
  - Division of Liver Surgery, Department of General Surgery and Laboratory of Liver Surgery, and State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan Province, China
- Tengda Huang
  - Division of Liver Surgery, Department of General Surgery and Laboratory of Liver Surgery, and State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan Province, China
- Xiangzheng Chen
  - Division of Liver Surgery, Department of General Surgery and Laboratory of Liver Surgery, and State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan Province, China
- Qian Li
  - Division of Liver Surgery, Department of General Surgery and Laboratory of Liver Surgery, and State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan Province, China
- Mingheng Liao
  - Division of Liver Surgery, Department of General Surgery and Laboratory of Liver Surgery, and State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan Province, China
- Li Fu
  - Division of Liver Surgery, Department of General Surgery and Laboratory of Liver Surgery, and State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan Province, China
- Jiwei Huang
  - Division of Liver Surgery, Department of General Surgery and Laboratory of Liver Surgery, and State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan Province, China
- Kefei Yuan
  - Division of Liver Surgery, Department of General Surgery and Laboratory of Liver Surgery, and State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan Province, China
- Zhen Wang
  - Division of Liver Surgery, Department of General Surgery and Laboratory of Liver Surgery, and State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan Province, China
- Yong Zeng
  - Division of Liver Surgery, Department of General Surgery and Laboratory of Liver Surgery, and State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan Province, China
7
Yang X, Li T, Su Q, Liu Y, Kang C, Lyu Y, Zhao L, Nie Y, Pan Y. Application of large language models in disease diagnosis and treatment. Chin Med J (Engl) 2025; 138:130-142. [PMID: 39722188 PMCID: PMC11745858 DOI: 10.1097/cm9.0000000000003456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2024] [Indexed: 12/28/2024] Open
Abstract
Large language models (LLMs) such as ChatGPT, Claude, Llama, and Qwen are emerging as transformative technologies for the diagnosis and treatment of various diseases. With their exceptional long-context reasoning capabilities, LLMs are proficient in clinically relevant tasks, particularly in medical text analysis and interactive dialogue. They can enhance diagnostic accuracy by processing vast amounts of patient data and medical literature and have demonstrated their utility in diagnosing common diseases and facilitating the identification of rare diseases by recognizing subtle patterns in symptoms and test results. Building on their image-recognition abilities, multimodal LLMs (MLLMs) show promising potential for diagnosis based on radiography, chest computed tomography (CT), electrocardiography (ECG), and common pathological images. These models can also assist in treatment planning by suggesting evidence-based interventions and improving clinical decision support systems through integrated analysis of patient records. Despite these promising developments, significant challenges persist regarding the use of LLMs in medicine, including concerns regarding algorithmic bias, the potential for hallucinations, and the need for rigorous clinical validation. Ethical considerations also underscore the importance of maintaining the function of supervision in clinical practice. This paper highlights the rapid advancements in research on the diagnostic and therapeutic applications of LLMs across different medical disciplines and emphasizes the importance of policymaking, ethical supervision, and multidisciplinary collaboration in promoting more effective and safer clinical applications of LLMs. Future directions include the integration of proprietary clinical knowledge, the investigation of open-source and customized models, and the evaluation of real-time effects in clinical diagnosis and treatment practices.
Affiliation(s)
- Xintian Yang
  - State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and National Clinical Research Center for Digestive Diseases, Xijing Hospital of Digestive Diseases, Fourth Military Medical University, Xi’an, Shaanxi 710032, China
- Tongxin Li
  - State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and National Clinical Research Center for Digestive Diseases, Xijing Hospital of Digestive Diseases, Fourth Military Medical University, Xi’an, Shaanxi 710032, China
- Qin Su
  - State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and National Clinical Research Center for Digestive Diseases, Xijing Hospital of Digestive Diseases, Fourth Military Medical University, Xi’an, Shaanxi 710032, China
- Yaling Liu
  - State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and National Clinical Research Center for Digestive Diseases, Xijing Hospital of Digestive Diseases, Fourth Military Medical University, Xi’an, Shaanxi 710032, China
- Chenxi Kang
  - State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and National Clinical Research Center for Digestive Diseases, Xijing Hospital of Digestive Diseases, Fourth Military Medical University, Xi’an, Shaanxi 710032, China
- Yong Lyu
  - State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and National Clinical Research Center for Digestive Diseases, Xijing Hospital of Digestive Diseases, Fourth Military Medical University, Xi’an, Shaanxi 710032, China
- Lina Zhao
  - Department of Radiotherapy, Xijing Hospital, Fourth Military Medical University, Xi’an, Shaanxi 710032, China
- Yongzhan Nie
  - State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and National Clinical Research Center for Digestive Diseases, Xijing Hospital of Digestive Diseases, Fourth Military Medical University, Xi’an, Shaanxi 710032, China
- Yanglin Pan
  - State Key Laboratory of Holistic Integrative Management of Gastrointestinal Cancers and National Clinical Research Center for Digestive Diseases, Xijing Hospital of Digestive Diseases, Fourth Military Medical University, Xi’an, Shaanxi 710032, China
8
Guo Y, Li T, Xie J, Luo M, Zheng C. Evaluating the accuracy, time and cost of GPT-4 and GPT-4o in liver disease diagnoses using cases from "What is Your Diagnosis". J Hepatol 2025; 82:e15-e17. [PMID: 39307371 DOI: 10.1016/j.jhep.2024.09.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2024] [Revised: 09/13/2024] [Accepted: 09/16/2024] [Indexed: 11/09/2024]
Affiliation(s)
- Yusheng Guo
  - Department of Radiology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430022, China; Hubei Key Laboratory of Molecular Imaging, Wuhan 430022, China
- Tianxiang Li
  - Department of Ultrasound, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing, 100730, China
- Jiao Xie
  - Health Management Center, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430022, China
- Miao Luo
  - Department of Infectious Diseases, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430022, China
- Chuansheng Zheng
  - Department of Radiology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430022, China; Hubei Key Laboratory of Molecular Imaging, Wuhan 430022, China
9
Zhou S, Luo X, Chen C, Jiang H, Yang C, Ran G, Yu J, Yin C. The performance of large language model-powered chatbots compared to oncology physicians on colorectal cancer queries. Int J Surg 2024; 110:6509-6517. [PMID: 38935100 PMCID: PMC11487020 DOI: 10.1097/js9.0000000000001850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Accepted: 06/06/2024] [Indexed: 06/28/2024]
Abstract
BACKGROUND: Large language model (LLM)-powered chatbots have become increasingly prevalent in healthcare, while their capacity in oncology remains largely unknown. This study aimed to evaluate the performance of LLM-powered chatbots compared to oncology physicians in addressing colorectal cancer queries.
METHODS: This study was conducted between August 13, 2023, and January 5, 2024. A total of 150 questions were designed, and each question was submitted three times to eight chatbots: ChatGPT-3.5, ChatGPT-4, ChatGPT-4 Turbo, Doctor GPT, Llama-2-70B, Mixtral-8x7B, Bard, and Claude 2.1. No feedback was provided to these chatbots. The questions were also answered by nine oncology physicians, including three residents, three fellows, and three attendings. Each answer was scored based on its consistency with guidelines, with a score of 1 for consistent answers and 0 for inconsistent answers. The total score for each question was based on the number of correct answers, ranging from 0 to 3. The accuracy and scores of the chatbots were compared to those of the physicians.
RESULTS: Claude 2.1 demonstrated the highest accuracy, with an average accuracy of 82.67%, followed by Doctor GPT at 80.45%, ChatGPT-4 Turbo at 78.44%, ChatGPT-4 at 78%, Mixtral-8x7B at 73.33%, Bard at 70%, ChatGPT-3.5 at 64.89%, and Llama-2-70B at 61.78%. Claude 2.1 outperformed residents, fellows, and attendings. Doctor GPT outperformed residents and fellows. Additionally, Mixtral-8x7B outperformed residents. In terms of scores, Claude 2.1 outperformed residents and fellows. Doctor GPT, ChatGPT-4 Turbo, and ChatGPT-4 outperformed residents.
CONCLUSIONS: This study shows that LLM-powered chatbots can provide more accurate medical information compared to oncology physicians.
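The scoring scheme described in the methods (three submissions per question, each marked 1 or 0, summed to a per-question score of 0 to 3) can be sketched as follows. The marks below are hypothetical, not study data, and treating accuracy as the share of all individual answers marked consistent is an assumption; the abstract does not define how average accuracy was computed.

```python
from statistics import mean

def question_score(marks):
    """Per-question score: sum of three binary marks (1 = consistent with guidelines)."""
    if len(marks) != 3 or any(m not in (0, 1) for m in marks):
        raise ValueError("expected exactly three marks, each 0 or 1")
    return sum(marks)

def accuracy_percent(all_marks):
    """Share of individual answers marked consistent, in percent (assumed definition)."""
    flat = [m for marks in all_marks for m in marks]
    return 100 * mean(flat)

# Hypothetical marks for four questions answered three times each.
example_marks = [(1, 1, 1), (1, 0, 1), (0, 0, 0), (1, 1, 0)]

scores = [question_score(marks) for marks in example_marks]
overall = accuracy_percent(example_marks)
print(scores, f"{overall:.2f}%")
```

Keeping the per-question score and the overall accuracy as separate functions mirrors the abstract, which reports comparisons on both measures.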
Affiliation(s)
- Shan Zhou
  - Florida Research and Innovation Center, Cleveland Clinic, Port St. Lucie, FL, USA
- Xiao Luo
  - Department of Radiology, The First Affiliated Hospital of Shenzhen University, Health Science Center, Shenzhen Second People’s Hospital, Shenzhen, China
- Chan Chen
  - Department of Clinical Laboratory, Shenzhen Baoan Hospital, The Second Affiliated Hospital of Shenzhen University, Shenzhen
- Hong Jiang
  - Statistical Office, Zhuhai People’s Hospital, Zhuhai Clinical Medical College of Jinan University, Zhuhai
  - Faculty of Medicine, Macau University of Science and Technology, Macau, China
- Chun Yang
  - Department of Radiology, The First Affiliated Hospital of Shenzhen University, Health Science Center, Shenzhen Second People’s Hospital, Shenzhen, China
- Guanghui Ran
  - Department of Radiology, The First Affiliated Hospital of Shenzhen University, Health Science Center, Shenzhen Second People’s Hospital, Shenzhen, China
- Juan Yu
  - Department of Radiology, The First Affiliated Hospital of Shenzhen University, Health Science Center, Shenzhen Second People’s Hospital, Shenzhen, China
- Chengliang Yin
  - Faculty of Medicine, Macau University of Science and Technology, Macau, China
| |
Collapse
|
10
|
Murali M, Wiles MD. Large language models and artificial intelligence: the coming storm for academia. Anaesthesia 2024. [PMID: 39316447 DOI: 10.1111/anae.16441] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/09/2024] [Indexed: 09/26/2024]
Affiliation(s)
- Mayur Murali
  - Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, Division of Anaesthetics, Pain Medicine and Intensive Care, London, UK
- Matthew D Wiles
  - Department of Academic Anaesthesia, Sheffield Teaching Hospitals NHS Foundation Trust, Sheffield, UK
  - Centre for Applied Health and Social Care Research, Sheffield Hallam University, Sheffield, UK
|
11
|
Mese I. Tracing the Footprints of AI in Radiology Literature: A Detailed Analysis of Journal Abstracts. ROFO-FORTSCHR RONTG 2024; 196:843-849. [PMID: 38228155 DOI: 10.1055/a-2224-9230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2024]
Affiliation(s)
- Ismail Mese
  - Department of Radiology, Istanbul Erenkoy Mental and Nervous Diseases Training and Research Hospital, Istanbul, Turkey
|
12
|
Jo MH, Kim MJ, Oh HK, Choi MJ, Shin HR, Lee TG, Ahn HM, Kim DW, Kang SB. Communicative competence of generative artificial intelligence in responding to patient queries about colorectal cancer surgery. Int J Colorectal Dis 2024; 39:94. [PMID: 38902500 PMCID: PMC11189990 DOI: 10.1007/s00384-024-04670-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/13/2024] [Indexed: 06/22/2024]
Abstract
PURPOSE To examine the ability of generative artificial intelligence (GAI) to answer patients' questions regarding colorectal cancer (CRC). METHODS Ten clinically relevant questions about CRC were selected from top-rated hospitals' websites and patient surveys and presented to three GAI tools (Chatbot Generative Pre-Trained Transformer [GPT-4], Google Bard, and CLOVA X). Their responses were compared with answers from the CRC information book. Response evaluation was performed by two groups, each consisting of five healthcare professionals (HCP) and patients. Each question was scored on a 1-5 Likert scale based on four evaluation criteria (maximum score, 20 points/question). RESULTS In an analysis including only HCPs, the information book scored 11.8 ± 1.2, GPT-4 scored 13.5 ± 1.1, Google Bard scored 11.5 ± 0.7, and CLOVA X scored 12.2 ± 1.4 (P = 0.001). The score of GPT-4 was significantly higher than those of the information book (P = 0.020) and Google Bard (P = 0.001). In an analysis including only patients, the information book scored 14.1 ± 1.4, GPT-4 scored 15.2 ± 1.8, Google Bard scored 15.5 ± 1.8, and CLOVA X scored 14.4 ± 1.8, without significant differences (P = 0.234). When both groups of evaluators were included, the information book scored 13.0 ± 0.9, GPT-4 scored 14.4 ± 1.2, Google Bard scored 13.5 ± 1.0, and CLOVA X scored 13.3 ± 1.5 (P = 0.070). CONCLUSION The three GAIs demonstrated similar or better communicative competence than the information book regarding questions related to CRC surgery in Korean. If high-quality medical information provided by GAI is supervised properly by HCPs and published as an information book, it could be helpful for patients to obtain accurate information and make informed decisions.
Affiliation(s)
- Min Hyeong Jo
  - Department of Surgery, Seoul National University Bundang Hospital, 300 Gumi-dong Bundang-gu, Seongnam-si, Gyeonggi-do, 13620, South Korea
- Min-Jun Kim
  - Department of Surgery, Seoul National University College of Medicine, Seoul, South Korea
- Heung-Kwon Oh
  - Department of Surgery, Seoul National University Bundang Hospital, 300 Gumi-dong Bundang-gu, Seongnam-si, Gyeonggi-do, 13620, South Korea
  - Department of Surgery, Seoul National University College of Medicine, Seoul, South Korea
- Mi Jeong Choi
  - Department of Surgery, Seoul National University Bundang Hospital, 300 Gumi-dong Bundang-gu, Seongnam-si, Gyeonggi-do, 13620, South Korea
- Hye-Rim Shin
  - Department of Surgery, Seoul National University Bundang Hospital, 300 Gumi-dong Bundang-gu, Seongnam-si, Gyeonggi-do, 13620, South Korea
- Tae-Gyun Lee
  - Department of Surgery, Seoul National University Bundang Hospital, 300 Gumi-dong Bundang-gu, Seongnam-si, Gyeonggi-do, 13620, South Korea
- Hong-Min Ahn
  - Department of Surgery, Seoul National University Bundang Hospital, 300 Gumi-dong Bundang-gu, Seongnam-si, Gyeonggi-do, 13620, South Korea
- Duck-Woo Kim
  - Department of Surgery, Seoul National University Bundang Hospital, 300 Gumi-dong Bundang-gu, Seongnam-si, Gyeonggi-do, 13620, South Korea
  - Department of Surgery, Seoul National University College of Medicine, Seoul, South Korea
- Sung-Bum Kang
  - Department of Surgery, Seoul National University Bundang Hospital, 300 Gumi-dong Bundang-gu, Seongnam-si, Gyeonggi-do, 13620, South Korea
  - Department of Surgery, Seoul National University College of Medicine, Seoul, South Korea
|
13
|
Shukla AK, Terziyan V, Tiihonen T. AI as a user of AI: Towards responsible autonomy. Heliyon 2024; 10:e31397. [PMID: 38947449 PMCID: PMC11214353 DOI: 10.1016/j.heliyon.2024.e31397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2024] [Revised: 05/14/2024] [Accepted: 05/15/2024] [Indexed: 07/02/2024] Open
Abstract
Recent advancements in Artificial Intelligence (AI), particularly in generative language models and algorithms, have led to significant impacts across diverse domains. AI's capability to address prompts is growing beyond human capability, but we expect AI to also perform well as a prompt engineer. Additionally, AI can serve as a guardian for ethical, security, and other predefined issues related to generated content. We postulate that enforcing dialogues among AI-as-prompt-engineer, AI-as-prompt-responder, and AI-as-Compliance-Guardian can lead to high-quality and responsible solutions. This paper introduces a novel AI collaboration paradigm emphasizing responsible autonomy, with implications for addressing real-world challenges. The paradigm of responsible AI-AI conversation establishes structured interaction patterns, guaranteeing decision-making autonomy. Key implications include enhanced understanding of AI dialogue flow, compliance with rules and regulations, and decision-making scenarios exemplifying responsible autonomy. Real-world applications envision AI systems autonomously addressing complex challenges. We conducted preliminary testing of this paradigm, involving instances of ChatGPT autonomously playing various roles in a set of experimental AI-AI conversations, and observed evident added value of such a framework.
Affiliation(s)
- Amit K. Shukla
  - School of Technology and Innovations, University of Vaasa, Wolffintie 34, FI-65200, Vaasa, Finland
- Vagan Terziyan
  - Faculty of Information Technology, University of Jyvaskyla, Box 35 (Agora), 40014, Jyvaskyla, Finland
- Timo Tiihonen
  - Faculty of Information Technology, University of Jyvaskyla, Box 35 (Agora), 40014, Jyvaskyla, Finland
|
14
|
Pugliese N, Polverini D, Lombardi R, Pennisi G, Ravaioli F, Armandi A, Buzzetti E, Dalbeni A, Liguori A, Mantovani A, Villani R, Gardini I, Hassan C, Valenti L, Miele L, Petta S, Sebastiani G, Aghemo A, NAFLD Expert Chatbot Working Group. Evaluation of ChatGPT as a Counselling Tool for Italian-Speaking MASLD Patients: Assessment of Accuracy, Completeness and Comprehensibility. J Pers Med 2024; 14:568. [PMID: 38929789 PMCID: PMC11204905 DOI: 10.3390/jpm14060568] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2024] [Revised: 05/19/2024] [Accepted: 05/21/2024] [Indexed: 06/28/2024] Open
Abstract
BACKGROUND Artificial intelligence (AI)-based chatbots have shown promise in providing counseling to patients with metabolic dysfunction-associated steatotic liver disease (MASLD). While ChatGPT3.5 has demonstrated the ability to comprehensively answer MASLD-related questions in English, its accuracy remains suboptimal. Whether language influences these results is unclear. This study aims to assess ChatGPT's performance as a counseling tool for Italian MASLD patients. METHODS Thirteen Italian experts rated the accuracy, completeness and comprehensibility of ChatGPT3.5 in answering 15 MASLD-related questions in Italian using six-point accuracy, three-point completeness and three-point comprehensibility Likert scales. RESULTS Mean scores for accuracy, completeness and comprehensibility were 4.57 ± 0.42, 2.14 ± 0.31 and 2.91 ± 0.07, respectively. The physical activity domain achieved the highest mean scores for accuracy and completeness, whereas the specialist referral domain achieved the lowest. Overall, Fleiss's coefficient of concordance for accuracy, completeness and comprehensibility across all 15 questions was 0.016, 0.075 and -0.010, respectively. Age and academic role of the evaluators did not influence the scores. The results were not significantly different from our previous study focusing on English. CONCLUSION Language does not appear to affect ChatGPT's ability to provide comprehensible and complete counseling to MASLD patients, but accuracy remains suboptimal in certain domains.
Affiliation(s)
- Nicola Pugliese
  - Department of Biomedical Sciences, Humanitas University, Pieve Emanuele, 20072 Milan, Italy
  - Division of Internal Medicine and Hepatology, Department of Gastroenterology, IRCCS Humanitas Research Hospital, Rozzano, 20089 Milan, Italy
- Davide Polverini
  - Department of Biomedical Sciences, Humanitas University, Pieve Emanuele, 20072 Milan, Italy
  - Division of Internal Medicine and Hepatology, Department of Gastroenterology, IRCCS Humanitas Research Hospital, Rozzano, 20089 Milan, Italy
- Rosa Lombardi
  - Unit of Internal Medicine and Metabolic Disease, Fondazione IRCCS Ca’ Granda Ospedale Maggiore Policlinico of Milan, 20122 Milan, Italy
  - Department of Pathophysiology and Transplantation, Università degli Studi di Milano, 20122 Milan, Italy
- Grazia Pennisi
  - Section of Gastroenterology and Hepatology, PROMISE, University of Palermo, 90127 Palermo, Italy
- Federico Ravaioli
  - Department of Medical and Surgical Sciences (DIMEC), University of Bologna, 40138 Bologna, Italy
  - Division of Internal Medicine, Hepatobiliary and Immunoallergic Diseases, IRCCS Azienda Ospedaliero Universitaria di Bologna, 40138 Bologna, Italy
- Angelo Armandi
  - Division of Gastroenterology and Hepatology, Department of Medical Sciences, University of Turin, Corso Dogliotti 14, 10126 Turin, Italy
  - Metabolic Liver Disease Research Program, I. Department of Internal Medicine, University Medical Center of Mainz, 55131 Mainz, Germany
- Elena Buzzetti
  - Internal Medicine and Centre for Hemochromatosis and Hereditary Liver Diseases, ERN-EuroBloodNet Center for Iron Disorders, Azienda Ospedaliero-Universitaria di Modena-Policlinico, 41125 Modena, Italy
  - Department of Medical and Surgical Sciences, Università degli Studi di Modena e Reggio Emilia, 41125 Modena, Italy
- Andrea Dalbeni
  - Division of General Medicine C, Department of Medicine, University and Azienda Ospedaliera Universitaria Integrata of Verona, University of Verona, 37134 Verona, Italy
  - Liver Unit, Department of Medicine, University and Azienda Ospedaliera Universitaria Integrata of Verona, University of Verona, 37134 Verona, Italy
- Antonio Liguori
  - DiSMeC—Department of Scienze Mediche e Chirurgiche, Fondazione Policlinico Gemelli IRCCS, 00168 Rome, Italy
- Alessandro Mantovani
  - Section of Endocrinology, Diabetes and Metabolism, Department of Medicine, University and Azienda Ospedaliera Universitaria Integrata of Verona, Piazzale Stefani, 37126 Verona, Italy
- Rosanna Villani
  - C.U.R.E. (University Center for Liver Disease Research and Treatment), Liver Unit, Department of Medical and Surgical Sciences, University of Foggia, 71122 Foggia, Italy
- Ivan Gardini
  - EpaC Onlus, Italian Liver Patient Association, 10141 Turin, Italy
- Cesare Hassan
  - Department of Biomedical Sciences, Humanitas University, Pieve Emanuele, 20072 Milan, Italy
  - Division of Gastroenterology and Digestive Endoscopy, Humanitas Research Hospital, IRCCS, Rozzano, 20089 Milan, Italy
- Luca Valenti
  - Department of Pathophysiology and Transplantation, Università degli Studi di Milano, 20122 Milan, Italy
  - Precision Medicine Lab, Biological Resource Center, Department of Transfusion Medicine, Fondazione IRCCS Ca’ Granda Ospedale Maggiore Policlinico, 20122 Milan, Italy
- Luca Miele
  - DiSMeC—Department of Scienze Mediche e Chirurgiche, Fondazione Policlinico Gemelli IRCCS, 00168 Rome, Italy
  - Department of Medicina e Chirurgia Traslazionale, Università Cattolica Del Sacro Cuore, 00168 Rome, Italy
- Salvatore Petta
  - Section of Gastroenterology and Hepatology, PROMISE, University of Palermo, 90127 Palermo, Italy
- Giada Sebastiani
  - Division of Gastroenterology and Hepatology, McGill University Health Centre, Montreal, QC H4A 3J1, Canada
- Alessio Aghemo
  - Department of Biomedical Sciences, Humanitas University, Pieve Emanuele, 20072 Milan, Italy
  - Division of Internal Medicine and Hepatology, Department of Gastroenterology, IRCCS Humanitas Research Hospital, Rozzano, 20089 Milan, Italy
|
15
|
Sandmann S, Riepenhausen S, Plagwitz L, Varghese J. Systematic analysis of ChatGPT, Google search and Llama 2 for clinical decision support tasks. Nat Commun 2024; 15:2050. [PMID: 38448475 PMCID: PMC10917796 DOI: 10.1038/s41467-024-46411-8] [Citation(s) in RCA: 48] [Impact Index Per Article: 48.0] [Received: 10/11/2023] [Accepted: 02/27/2024] [Indexed: 03/08/2024] Open
Abstract
It is likely that individuals are turning to Large Language Models (LLMs) to seek health advice, much like searching for diagnoses on Google. We evaluate clinical accuracy of GPT-3.5 and GPT-4 for suggesting initial diagnosis, examination steps and treatment of 110 medical cases across diverse clinical disciplines. Moreover, two model configurations of the Llama 2 open source LLMs are assessed in a sub-study. For benchmarking the diagnostic task, we conduct a naïve Google search for comparison. Overall, GPT-4 performed best with superior performances over GPT-3.5 considering diagnosis and examination and superior performance over Google for diagnosis. Except for treatment, better performance on frequent vs rare diseases is evident for all three approaches. The sub-study indicates slightly lower performances for Llama models. In conclusion, the commercial LLMs show growing potential for medical question answering in two successive major releases. However, some weaknesses underscore the need for robust and regulated AI models in health care. Open source LLMs can be a viable option to address specific needs regarding data privacy and transparency of training.
Affiliation(s)
- Sarah Sandmann
  - Institute of Medical Informatics, University of Münster, Münster, Germany
- Sarah Riepenhausen
  - Institute of Medical Informatics, University of Münster, Münster, Germany
- Lucas Plagwitz
  - Institute of Medical Informatics, University of Münster, Münster, Germany
- Julian Varghese
  - Institute of Medical Informatics, University of Münster, Münster, Germany
|
16
|
Ray PP. Can LLMs improve existing scenario of healthcare? J Hepatol 2024; 80:e28-e29. [PMID: 37595847 DOI: 10.1016/j.jhep.2023.08.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Accepted: 08/11/2023] [Indexed: 08/20/2023]
|
17
|
Zhang Y, Wu L, Mu Z, Ren L, Chen Y, Liu H, Xu L, Wang Y, Wang Y, Cheng S, Tham YC, Sheng B, Wong TY, Ji H. Letter 2 regarding "Assessing the performance of ChatGPT in answering questions regarding cirrhosis and hepatocellular carcinoma". Clin Mol Hepatol 2024; 30:113-117. [PMID: 37946373 PMCID: PMC10776295 DOI: 10.3350/cmh.2023.0440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2023] [Revised: 11/08/2023] [Accepted: 11/10/2023] [Indexed: 11/12/2023] Open
Affiliation(s)
- Yiwen Zhang
  - Department of Endocrinology and Metabolic Hepatology, The Affiliated Hospital of Qingdao University, Qingdao, China
- Liwei Wu
  - Department of Gastroenterology and Hepatology, Shanghai East Hospital, Tongji University, Shanghai, China
- Zepeng Mu
  - Department of Endocrinology and Metabolic Hepatology, The Affiliated Hospital of Qingdao University, Qingdao, China
- Linlin Ren
  - Department of Gastroenterology and Hepatology, The Affiliated Hospital of Qingdao University, Qingdao, China
- Ying Chen
  - Shandong Provincial Key Laboratory of Metabolic Diseases and Qingdao Key Laboratory of Gout, the Affiliated Hospital of Qingdao University, Qingdao, China
- Hanyun Liu
  - Department of Infectious Disease and Hepatology, The Affiliated Hospital of Qingdao University, Qingdao, China
- Lili Xu
  - Department of Endocrinology and Metabolic Hepatology, The Affiliated Hospital of Qingdao University, Qingdao, China
- Yangang Wang
  - Department of Endocrinology and Metabolic Hepatology, The Affiliated Hospital of Qingdao University, Qingdao, China
- Yaxing Wang
  - Beijing Key Laboratory of Ophthalmology and Visual Sciences, Beijing Tongren Eye Center, Beijing Institute of Ophthalmology, Beijing Tongren Hospital, Capital Medical University, Beijing, China
- Susan Cheng
  - Department of Cardiology, Smidt Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Yih Chung Tham
  - Yong Loo Lin School of Medicine, National University of Singapore, Singapore
  - Singapore Eye Research Institute, Singapore National Eye Center, Singapore
- Bin Sheng
  - Department of Computer Science and Engineering, Shanghai JiaoTong University, Shanghai, China
- Tien Yin Wong
  - Singapore Eye Research Institute, Singapore National Eye Center, Singapore
  - Tsinghua Medicine, Tsinghua University, Beijing, China
- Hongwei Ji
  - Tsinghua Medicine, Tsinghua University, Beijing, China
  - Department of Internal Medicine, Beijing Tsinghua Changgung Hospital, Beijing, China
|