101
Amato I, Simona G. Re: Letter to the Editor: what are the legal and ethical considerations of submitting radiology reports to ChatGPT? Clin Radiol 2024; 79:e982-e983. [PMID: 38719687] [DOI: 10.1016/j.crad.2024.04.001]
Affiliation(s)
- I Amato
- ARC Advanced Radiology Center (ARC), Department of Oncological Radiotherapy, and Hematology, Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy.
- G Simona
- ARC Advanced Radiology Center (ARC), Department of Oncological Radiotherapy, and Hematology, Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy; Università Cattolica del Sacro Cuore, Facoltà di Medicina e Chirurgia, Rome, Italy.
102
Tessler I, Wolfovitz A, Alon EE, Gecel NA, Livneh N, Zimlichman E, Klang E. ChatGPT's adherence to otolaryngology clinical practice guidelines. Eur Arch Otorhinolaryngol 2024; 281:3829-3834. [PMID: 38647684] [DOI: 10.1007/s00405-024-08634-9]
Abstract
OBJECTIVES Large language models, including ChatGPT, have the potential to transform the way we approach medical knowledge, yet accuracy in clinical topics is critical. Here we assessed ChatGPT's performance in adhering to the American Academy of Otolaryngology-Head and Neck Surgery guidelines. METHODS We presented ChatGPT with 24 clinical otolaryngology questions based on the guidelines of the American Academy of Otolaryngology. This was done three times (N = 72) to test the model's consistency. Two otolaryngologists evaluated the responses for accuracy and relevance to the guidelines. Cohen's kappa was used to measure evaluator agreement, and Cronbach's alpha assessed the consistency of ChatGPT's responses. RESULTS The study revealed mixed results; 59.7% (43/72) of ChatGPT's responses were highly accurate, while only 2.8% (2/72) directly contradicted the guidelines. The model showed 100% accuracy in Head and Neck, but lower accuracy in Rhinology and Otology/Neurotology (66%), Laryngology (50%), and Pediatrics (8%). The model's responses were consistent in 17/24 questions (70.8%), with a Cronbach's alpha of 0.87, indicating reasonable consistency across tests. CONCLUSIONS Using a guideline-based set of structured questions, ChatGPT demonstrates consistency but variable accuracy in otolaryngology. Its lower performance in some areas, especially Pediatrics, suggests that further rigorous evaluation is needed before considering real-world clinical use.
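The Cronbach's alpha of 0.87 reported above measures consistency across the three repeated runs. A minimal sketch of how such a statistic is computed, using a hypothetical score matrix rather than the study's data:

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for a (questions x repeated runs) score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of repeated runs
    item_vars = scores.var(axis=0, ddof=1)       # variance of each run's scores
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of per-question totals
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Toy example: 5 questions, each scored on three runs (1 = accurate, 0 = not).
runs = np.array([
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
])
print(round(cronbach_alpha(runs), 2))  # → 0.79
```

Values near 1 indicate that the model gives essentially the same quality of answer on every repetition of a question.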
Affiliation(s)
- Idit Tessler
- Department of Otolaryngology and Head and Neck Surgery, Sheba Medical Center, Ramat Gan, Israel.
- School of Medicine, Tel Aviv University, Tel Aviv, Israel.
- ARC Innovation Center, Sheba Medical Center, Ramat Gan, Israel.
- Amit Wolfovitz
- Department of Otolaryngology and Head and Neck Surgery, Sheba Medical Center, Ramat Gan, Israel
- School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Eran E Alon
- Department of Otolaryngology and Head and Neck Surgery, Sheba Medical Center, Ramat Gan, Israel
- School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Nir A Gecel
- School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Nir Livneh
- Department of Otolaryngology and Head and Neck Surgery, Sheba Medical Center, Ramat Gan, Israel
- School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Eyal Zimlichman
- School of Medicine, Tel Aviv University, Tel Aviv, Israel
- ARC Innovation Center, Sheba Medical Center, Ramat Gan, Israel
- The Sheba Talpiot Medical Leadership Program, Ramat Gan, Israel
- Hospital Management, Sheba Medical Center, Ramat Gan, Israel
- Eyal Klang
- The Division of Data-Driven and Digital Medicine (D3M), Icahn School of Medicine at Mount Sinai, New York, USA
103
Xu R, Wang Z. Generative artificial intelligence in healthcare from the perspective of digital media: Applications, opportunities and challenges. Heliyon 2024; 10:e32364. [PMID: 38975200] [PMCID: PMC11225727] [DOI: 10.1016/j.heliyon.2024.e32364]
Abstract
Introduction The emergence and application of generative artificial intelligence/large language models (hereafter GenAI LLMs) have the potential for significant impact on the healthcare industry. However, there is currently a lack of systematic research on GenAI LLMs in healthcare based on reliable data. This article aims to conduct an exploratory study of the application of GenAI LLMs (i.e., ChatGPT) in healthcare from the perspective of digital media (i.e., online news), including the application scenarios, potential opportunities, and challenges. Methods This research used thematic qualitative text analysis in five steps: first, developing main topical categories based on relevant articles; second, encoding the search keywords using these categories; third, conducting searches for news articles via Google; fourth, encoding the sub-categories using the elaborated category system; and finally, conducting category-based analysis and presenting the results. Natural language processing techniques, including the TermRaider and AntConc tools, were applied in these steps to assist the qualitative text analysis. Additionally, this study built a framework for analyzing the above three topics from the perspective of five different stakeholders, including healthcare demanders and providers. Results This study summarizes 26 applications (e.g., providing medical advice, providing diagnosis and triage recommendations, providing mental health support), 21 opportunities (e.g., making healthcare more accessible, reducing healthcare costs, improving patient care), and 17 challenges (e.g., generating inaccurate/misleading/wrong answers, raising privacy concerns, lacking transparency), and analyzes the reasons these key items arise and the links between the three research topics.
Conclusions The application of GenAI LLMs in healthcare is primarily focused on transforming the way healthcare demanders access medical services (i.e., making them more intelligent, refined, and humane) and optimizing the processes through which healthcare providers offer medical services (i.e., simplifying them, ensuring timeliness, and reducing errors). As the application becomes more widespread and deepens, GenAI LLMs are expected to have a revolutionary impact on traditional healthcare service models, but they also inevitably raise ethical and security concerns. Furthermore, the application of GenAI LLMs in healthcare is still at an initial stage, and it can be accelerated through empirical or clinical research in a specific healthcare field (e.g., mental health) or on a specific mechanism (e.g., a mechanism for allocating the economic benefits of GenAI LLMs applied to healthcare).
Affiliation(s)
- Rui Xu
- School of Economics, Guangdong University of Technology, Guangzhou, China
- Zhong Wang
- School of Economics, Guangdong University of Technology, Guangzhou, China
- Key Laboratory of Digital Economy and Data Governance, Guangdong University of Technology, Guangzhou, China
104
Wu Y, Wu M, Wang C, Lin J, Liu J, Liu S. Evaluating the Prevalence of Burnout Among Health Care Professionals Related to Electronic Health Record Use: Systematic Review and Meta-Analysis. JMIR Med Inform 2024; 12:e54811. [PMID: 38865188] [PMCID: PMC11208837] [DOI: 10.2196/54811]
Abstract
BACKGROUND Burnout among health care professionals is a significant concern, with detrimental effects on health care service quality and patient outcomes. The use of the electronic health record (EHR) system has been identified as a significant contributor to burnout among health care professionals. OBJECTIVE This systematic review and meta-analysis aims to assess the prevalence of burnout among health care professionals associated with the use of the EHR system, thereby providing evidence to improve health information systems and develop strategies to measure and mitigate burnout. METHODS We conducted a comprehensive search of the PubMed, Embase, and Web of Science databases for English-language peer-reviewed articles published between January 1, 2009, and December 31, 2022. Two independent reviewers applied inclusion and exclusion criteria, and study quality was assessed using the Joanna Briggs Institute checklist and the Newcastle-Ottawa Scale. Meta-analyses were performed using R (version 4.1.3; R Foundation for Statistical Computing), with EndNote X7 (Clarivate) for reference management. RESULTS The review included 32 cross-sectional studies and 5 case-control studies with a total of 66,556 participants, mainly physicians and registered nurses. The pooled prevalence of burnout among health care professionals in cross-sectional studies was 40.4% (95% CI 37.5%-43.2%). Case-control studies indicated a higher likelihood of burnout among health care professionals who spent more time on EHR-related tasks outside work (odds ratio 2.43, 95% CI 2.31-2.57). CONCLUSIONS The findings highlight the association between the increased use of the EHR system and burnout among health care professionals. Potential solutions include optimizing EHR systems, implementing automated dictation or note-taking, employing scribes to reduce documentation burden, and leveraging artificial intelligence to enhance EHR system efficiency and reduce the risk of burnout. 
TRIAL REGISTRATION PROSPERO International Prospective Register of Systematic Reviews CRD42021281173; https://www.crd.york.ac.uk/prospero/display_record.php?ID=CRD42021281173.
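The pooled prevalence of 40.4% (95% CI 37.5%-43.2%) above comes from a meta-analysis of proportions. As a rough illustration of the inverse-variance pooling idea, here is a fixed-effect sketch with made-up study counts; the review itself ran its analyses in R, and a real synthesis of 37 heterogeneous studies would normally use a random-effects model:

```python
import math

# Hypothetical per-study burnout counts (events, sample size) -- illustrative only.
studies = [(120, 300), (85, 200), (240, 610)]

# Inverse-variance (fixed-effect) pooling of proportions.
weights, estimates = [], []
for events, n in studies:
    p = events / n
    var = p * (1 - p) / n           # binomial variance of the proportion
    weights.append(1 / var)
    estimates.append(p)

pooled = sum(w * p for w, p in zip(weights, estimates)) / sum(weights)
se = math.sqrt(1 / sum(weights))
lo, hi = pooled - 1.96 * se, pooled + 1.96 * se
print(f"pooled prevalence {pooled:.3f} (95% CI {lo:.3f}-{hi:.3f})")
```

Each study's estimate is weighted by the inverse of its variance, so larger, more precise studies dominate the pooled figure.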
Affiliation(s)
- Yuxuan Wu
- Department of Medical Informatics, West China Hospital, Sichuan University, Chengdu, China
- Mingyue Wu
- Information Center, West China Hospital, Sichuan University, Chengdu, China
- Changyu Wang
- West China College of Stomatology, Sichuan University, Chengdu, China
- Jie Lin
- Department of Oral Implantology, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Jialin Liu
- Department of Medical Informatics, West China Hospital, Sichuan University, Chengdu, China
- Information Center, West China Hospital, Sichuan University, Chengdu, China
- Siru Liu
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
105
Maggio MG, Tartarisco G, Cardile D, Bonanno M, Bruschetta R, Pignolo L, Pioggia G, Calabrò RS, Cerasa A. Exploring ChatGPT's potential in the clinical stream of neurorehabilitation. Front Artif Intell 2024; 7:1407905. [PMID: 38903157] [PMCID: PMC11187276] [DOI: 10.3389/frai.2024.1407905]
Abstract
In several medical fields, generative AI tools such as ChatGPT have achieved optimal performance in identifying correct diagnoses solely by evaluating narrative clinical descriptions of cases. The most active fields of application include oncology and COVID-19-related symptoms, with preliminary relevant results also in psychiatric and neurological domains. This scoping review aims to introduce the arrival of ChatGPT applications in neurorehabilitation practice, where such AI-driven solutions have the potential to revolutionize patient care and assistance. First, a comprehensive overview of ChatGPT, including its design and potential applications in medicine, is provided. Second, the remarkable natural language processing skills and limitations of these models are examined with a focus on their use in neurorehabilitation. In this context, we present two case scenarios to evaluate ChatGPT's ability to resolve higher-order clinical reasoning. Overall, we provide support for the first evidence that generative AI can be meaningfully integrated as a facilitator into neurorehabilitation practice, aiding physicians in defining increasingly efficacious diagnostic and personalized prognostic plans.
Affiliation(s)
- Gennaro Tartarisco
- Institute for Biomedical Research and Innovation (IRIB), National Research Council of Italy (CNR), Messina, Italy
- Roberta Bruschetta
- Institute for Biomedical Research and Innovation (IRIB), National Research Council of Italy (CNR), Messina, Italy
- Giovanni Pioggia
- Institute for Biomedical Research and Innovation (IRIB), National Research Council of Italy (CNR), Messina, Italy
- Antonio Cerasa
- Institute for Biomedical Research and Innovation (IRIB), National Research Council of Italy (CNR), Messina, Italy
- S’Anna Institute, Crotone, Italy
- Pharmacotechnology Documentation and Transfer Unit, Preclinical and Translational Pharmacology, Department of Pharmacy, Health and Nutritional Sciences, University of Calabria, Rende, Italy
106
Treviño-Juarez AS. Assessing Risk of Bias Using ChatGPT-4 and Cochrane ROB2 Tool. Med Sci Educ 2024; 34:691-694. [PMID: 38887420] [PMCID: PMC11180068] [DOI: 10.1007/s40670-024-02034-8]
Abstract
In the world of evidence-based medicine, systematic reviews have long been the gold standard. But they have had a problem: they take forever. That is where ChatGPT-4 and automation come in. They are like a breath of fresh air, speeding things up and making the process more reliable. ChatGPT-4 is like having a super-smart assistant who can quickly assess the risk of bias in research studies. It is a game-changer, especially in a field where getting the latest research quickly can mean life or death for patients. Sure, it is not perfect, and we still need humans to keep an eye on things and ensure everything is ethical. But the future looks bright. With ChatGPT-4 and automation, evidence-based medicine is on the fast track to success.
107
Li J, Tang T, Wu E, Zhao J, Zong H, Wu R, Feng W, Zhang K, Wang D, Qin Y, Shen Z, Qin Y, Ren S, Zhan C, Yang L, Wei Q, Shen B. RARPKB: a knowledge-guide decision support platform for personalized robot-assisted surgery in prostate cancer. Int J Surg 2024; 110:3412-3424. [PMID: 38498357] [PMCID: PMC11175739] [DOI: 10.1097/js9.0000000000001290]
Abstract
BACKGROUND Robot-assisted radical prostatectomy (RARP) has emerged as a pivotal surgical intervention for the treatment of prostate cancer (PCa). However, the complexity of clinical cases, the heterogeneity of PCa, and limitations in physician expertise pose challenges to rational decision-making in RARP. To address these challenges, the authors aimed to organize the knowledge from previously studied complex cohorts and establish an online platform, the RARP knowledge base (RARPKB), to provide reference evidence for personalized treatment plans. MATERIALS AND METHODS PubMed searches over the past two decades were conducted to identify publications describing RARP. The authors collected, classified, and structured surgical details, patient information, surgical data, and various statistical results from the literature. A knowledge-guided decision-support tool was established using MySQL, DataTable, ECharts, and JavaScript. ChatGPT-4 and two assessment scales were used to validate and compare the platform. RESULTS The platform comprised 583 studies, 1589 cohorts, 1 911 968 patients, and 11 986 records, resulting in 54 834 data entries. The knowledge-guided decision-support tool provides personalized surgical plan recommendations and potential complications on the basis of patients' baseline and surgical information. Compared with ChatGPT-4, RARPKB performed better on authenticity (100% vs. 73%), matching (100% vs. 53%), personalized recommendations (100% vs. 20%), matching of patients (100% vs. 0%), and personalized recommendations for complications (100% vs. 20%). After use, the average System Usability Scale score was 88.88±15.03, and the Net Promoter Score of RARPKB was 85. The knowledge base is available at: http://rarpkb.bioinf.org.cn . CONCLUSIONS The authors introduced the pioneering RARPKB, the first knowledge base for robot-assisted surgery, with an emphasis on PCa. RARPKB can assist in personalized and complex surgical planning for PCa to improve its efficacy. RARPKB provides a reference for future applications of artificial intelligence in clinical practice.
Affiliation(s)
- Jiakun Li
- Department of Urology, West China Hospital, Sichuan University
- Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University
- Tong Tang
- Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University
- Department of Computer Science and Information Technologies, Elviña Campus, University of A Coruña, A Coruña, Spain
- Erman Wu
- Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University
- Jing Zhao
- Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University
- Hui Zong
- Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University
- Rongrong Wu
- Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University
- Weizhe Feng
- Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University
- Ke Zhang
- Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University
- Chengdu Aixam Medical Technology Co. Ltd, Chengdu
- Dongyue Wang
- Department of Ophthalmology, West China Hospital, Sichuan University
- Yawen Qin
- Clinical Medical College, Southwest Medical University, Luzhou, Sichuan Province
- Yi Qin
- Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University
- Shumin Ren
- Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University
- Department of Computer Science and Information Technologies, Elviña Campus, University of A Coruña, A Coruña, Spain
- Chaoying Zhan
- Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University
- Lu Yang
- Department of Urology, West China Hospital, Sichuan University
- Qiang Wei
- Department of Urology, West China Hospital, Sichuan University
- Bairong Shen
- Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University
108
Liu S, McCoy AB, Wright AP, Carew B, Genkins JZ, Huang SS, Peterson JF, Steitz B, Wright A. Leveraging large language models for generating responses to patient messages-a subjective analysis. J Am Med Inform Assoc 2024; 31:1367-1379. [PMID: 38497958] [PMCID: PMC11105129] [DOI: 10.1093/jamia/ocae052]
Abstract
OBJECTIVE This study aimed to develop and assess the performance of fine-tuned large language models for generating responses to patient messages sent via an electronic health record patient portal. MATERIALS AND METHODS Utilizing a dataset of messages and responses extracted from the patient portal at a large academic medical center, we developed a model (CLAIR-Short) based on a pre-trained large language model (LLaMA-65B). In addition, we used the OpenAI API to rewrite physician responses from an open-source dataset into a format with informative paragraphs that offered patient education while emphasizing empathy and professionalism. Combining these datasets, we further fine-tuned our model (CLAIR-Long). To evaluate the fine-tuned models, we used 10 representative patient portal questions in primary care to generate responses. We asked primary care physicians to review the generated responses from our models and ChatGPT and rate them for empathy, responsiveness, accuracy, and usefulness. RESULTS The dataset consisted of 499 794 pairs of patient messages and corresponding responses from the patient portal, plus 5000 patient messages and ChatGPT-updated responses from an online platform. Four primary care physicians participated in the survey. CLAIR-Short exhibited the ability to generate concise responses similar to providers' responses. CLAIR-Long responses provided more patient educational content than CLAIR-Short and were rated similarly to ChatGPT's responses, receiving positive evaluations for responsiveness, empathy, and accuracy, and a neutral rating for usefulness. CONCLUSION This subjective analysis suggests that leveraging large language models to generate responses to patient messages has significant potential to facilitate communication between patients and healthcare providers.
Affiliation(s)
- Siru Liu
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37212, United States
- Allison B McCoy
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37212, United States
- Aileen P Wright
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37212, United States
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37212, United States
- Babatunde Carew
- Department of General Internal Medicine and Public Health, Vanderbilt University Medical Center, Nashville, TN 37212, United States
- Julian Z Genkins
- Department of Medicine, Stanford University, Stanford, CA 94304, United States
- Sean S Huang
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37212, United States
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37212, United States
- Josh F Peterson
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37212, United States
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37212, United States
- Bryan Steitz
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37212, United States
- Adam Wright
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37212, United States
109
Levin G, Pareja R, Viveros-Carreño D, Sanchez Diaz E, Yates EM, Zand B, Ramirez PT. Association of reviewer experience with discriminating human-written versus ChatGPT-written abstracts. Int J Gynecol Cancer 2024; 34:669-674. [PMID: 38627032] [DOI: 10.1136/ijgc-2023-005162]
Abstract
OBJECTIVE To determine if reviewer experience impacts the ability to discriminate between human-written and ChatGPT-written abstracts. METHODS Thirty reviewers (10 seniors, 10 juniors, and 10 residents) were asked to differentiate between 10 ChatGPT-written and 10 human-written (fabricated) abstracts. For the study, 10 gynecologic oncology abstracts were fabricated by the authors. For each human-written abstract we generated a matching ChatGPT abstract by using the same title and the fabricated results of the human-generated abstract. A web-based questionnaire was used to gather demographic data and to record the reviewers' evaluation of the 20 abstracts. Comparative statistics and multivariable regression were used to identify factors associated with a higher correct identification rate. RESULTS The 30 reviewers each evaluated 20 abstracts, giving a total of 600 abstract evaluations. The reviewers correctly identified 300/600 (50%) of the abstracts: 139/300 (46.3%) of the ChatGPT-generated abstracts and 161/300 (53.7%) of the human-written abstracts (p=0.07). Human-written abstracts had a higher rate of correct identification (median (IQR) 56.7% (49.2-64.1%) vs 45.0% (43.2-48.3%), p=0.023). Senior reviewers had a higher correct identification rate (60%) than junior reviewers and residents (45% each; p=0.043 and p=0.002, respectively). In a linear regression model including the experience level of the reviewers, familiarity with artificial intelligence (AI), and the country in which the majority of medical training was achieved (English speaking vs non-English speaking), the experience of the reviewer (β=10.2 (95% CI 1.8 to 18.7)) and familiarity with AI (β=7.78 (95% CI 0.6 to 15.0)) were independently associated with the correct identification rate (p=0.019 and p=0.035, respectively). In a correlation analysis, the number of publications by the reviewer was positively correlated with the correct identification rate (r(28)=0.61, p<0.001).
CONCLUSION A total of 46.3% of abstracts written by ChatGPT were detected by reviewers. The correct identification rate increased with reviewer and publication experience.
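The comparison of correct-identification counts above (161/300 human-written vs 139/300 ChatGPT-written, p=0.07) can be reproduced approximately with a standard two-proportion z-test; the abstract does not specify which test the authors used, so this is only an illustrative sketch:

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Two-proportion z-test with pooled variance; returns z and two-sided p."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)                       # pooled proportion
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # 2 * (1 - Phi(|z|))
    return z, p_two_sided

# Counts from the abstract: correctly identified human-written vs
# ChatGPT-written abstracts.
z, p = two_proportion_z(161, 300, 139, 300)
print(f"z = {z:.2f}, p = {p:.3f}")
```

The resulting p-value of roughly 0.07 matches the reported non-significant difference between the two abstract types.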
Affiliation(s)
- Gabriel Levin
- Division of Gynecologic Oncology, Jewish General Hospital, McGill University, Montreal, Quebec, Canada
- Rene Pareja
- Gynecologic Oncology, Clinica ASTORGA, Medellin, and Instituto Nacional de Cancerología, Bogotá, Colombia
- David Viveros-Carreño
- Unidad Ginecología Oncológica, Grupo de Investigación GIGA, Centro de Tratamiento e Investigación sobre Cáncer Luis Carlos Sarmiento Angulo - CTIC, Bogotá, Colombia
- Department of Gynecologic Oncology, Clínica Universitaria Colombia, Bogotá, Colombia
- Emmanuel Sanchez Diaz
- Universidad Pontificia Bolivariana Clinica Universitaria Bolivariana, Medellin, Colombia
- Elise Mann Yates
- Obstetrics and Gynecology, Houston Methodist Hospital, Houston, Texas, USA
- Behrouz Zand
- Gynecologic Oncology, Houston Methodist, Shenandoah, Texas, USA
- Pedro T Ramirez
- Department of Obstetrics and Gynecology, Houston Methodist Hospital, Houston, Texas, USA
110
Mohammad-Rahimi H, Khoury ZH, Alamdari MI, Rokhshad R, Motie P, Parsa A, Tavares T, Sciubba JJ, Price JB, Sultan AS. Performance of AI chatbots on controversial topics in oral medicine, pathology, and radiology. Oral Surg Oral Med Oral Pathol Oral Radiol 2024; 137:508-514. [PMID: 38553304] [DOI: 10.1016/j.oooo.2024.01.015]
Abstract
OBJECTIVES In this study, we assessed the responses of 6 different artificial intelligence (AI) chatbots (Bing, GPT-3.5, GPT-4, Google Bard, Claude, Sage) to controversial and difficult questions in oral pathology, oral medicine, and oral radiology. STUDY DESIGN The chatbots' answers were evaluated by board-certified specialists using a modified version of the global quality score on a 5-point Likert scale. The quality and validity of chatbot citations were also evaluated. RESULTS Claude had the highest mean score for oral pathology and medicine (4.341 ± 0.582); Bing had the lowest (3.447 ± 0.566). In oral radiology, GPT-4 had the highest mean score (3.621 ± 1.009) and Bing the lowest (2.379 ± 0.978). GPT-4 achieved the highest mean score across all disciplines (4.066 ± 0.825). Of the citations generated by the chatbots, 82 of 349 (23.50%) were fake. CONCLUSIONS The chatbot that provided the highest-quality information on controversial topics across the dental disciplines was GPT-4. Although most chatbots performed well, it is suggested that developers of AI medical chatbots incorporate scientific citation authenticators to validate outputted citations, given the relatively high number of fabricated citations.
Affiliation(s)
- Hossein Mohammad-Rahimi
- Division of Artificial Intelligence Research, University of Maryland School of Dentistry, Baltimore, MD, USA; Topic Group Dental Diagnostics and Digital Dentistry, ITU/WHO Focus Group AI on Health, Berlin, Germany
- Zaid H Khoury
- Department of Oral Diagnostic Sciences and Research, Meharry Medical College School of Dentistry, Nashville, TN, USA
- Mina Iranparvar Alamdari
- Department of Oral and Maxillofacial Radiology, School of Dentistry, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Rata Rokhshad
- Topic Group Dental Diagnostics and Digital Dentistry, ITU/WHO Focus Group AI on Health, Berlin, Germany
- Parisa Motie
- Medical Image and Signal Processing Research Center, Medical University of Isfahan, Isfahan, Iran
- Azin Parsa
- Department of Oncology and Diagnostic Sciences, University of Maryland School of Dentistry, Baltimore, MD, USA
- Tiffany Tavares
- Department of Comprehensive Dentistry, UT Health San Antonio School of Dentistry, San Antonio, TX, USA
- James J Sciubba
- Department of Otolaryngology, Head & Neck Surgery, The Johns Hopkins University, Baltimore, MD, USA
- Jeffery B Price
- Division of Artificial Intelligence Research, University of Maryland School of Dentistry, Baltimore, MD, USA; Department of Oncology and Diagnostic Sciences, University of Maryland School of Dentistry, Baltimore, MD, USA
- Ahmed S Sultan
- Division of Artificial Intelligence Research, University of Maryland School of Dentistry, Baltimore, MD, USA; Department of Oncology and Diagnostic Sciences, University of Maryland School of Dentistry, Baltimore, MD, USA; University of Maryland Marlene and Stewart Greenebaum Comprehensive Cancer Center, Baltimore, MD, USA.
111
Jokar M, Abdous A, Rahmanian V. AI chatbots in pet health care: Opportunities and challenges for owners. Vet Med Sci 2024; 10:e1464. [PMID: 38678576] [PMCID: PMC11056198] [DOI: 10.1002/vms3.1464]
Abstract
The integration of artificial intelligence (AI) into health care has seen remarkable advancements, with applications extending to animal health. This article explores the potential benefits and challenges associated with employing AI chatbots as tools for pet health care. Focusing on ChatGPT, a prominent language model, the authors elucidate its capabilities and its potential impact on pet owners' decision-making processes. AI chatbots offer pet owners access to extensive information on animal health, research studies and diagnostic options, providing a cost-effective and convenient alternative to traditional veterinary consultations. The outcome of a case involving a Border Collie named Sassy demonstrates the potential benefits of AI in veterinary medicine. In this instance, ChatGPT played a pivotal role in suggesting a diagnosis that led to successful treatment, showcasing the potential of AI chatbots as valuable tools in complex cases. However, concerns arise regarding pet owners relying solely on AI chatbots for medical advice, potentially resulting in misdiagnosis, inappropriate treatment and delayed professional intervention. We emphasize the need for a balanced approach, positioning AI chatbots as supplementary tools rather than substitutes for licensed veterinarians. To mitigate risks, the article proposes strategies such as educating pet owners on AI chatbots' limitations, implementing regulations to guide AI chatbot companies and fostering collaboration between AI chatbots and veterinarians. The intricate web of responsibilities in this dynamic landscape underscores the importance of government regulations, the educational role of AI chatbots and the symbiotic relationship between AI technology and veterinary expertise. In conclusion, while AI chatbots hold immense promise in transforming pet health care, cautious and informed usage is crucial.
By promoting awareness, establishing regulations and fostering collaboration, the article advocates for a responsible integration of AI chatbots to ensure optimal care for pets.
Collapse
Affiliation(s)
- Mohammad Jokar
- Faculty of Veterinary Medicine, Karaj Branch, Islamic Azad University, Karaj, Iran
| | - Arman Abdous
- Faculty of Veterinary Medicine, Karaj Branch, Islamic Azad University, Karaj, Iran
| | - Vahid Rahmanian
- Department of Public Health, Torbat Jam Faculty of Medical Sciences, Torbat Jam, Iran
| |
Collapse
|
112
|
Huo B, Calabrese E, Sylla P, Kumar S, Ignacio RC, Oviedo R, Hassan I, Slater BJ, Kaiser A, Walsh DS, Vosburg W. The performance of artificial intelligence large language model-linked chatbots in surgical decision-making for gastroesophageal reflux disease. Surg Endosc 2024; 38:2320-2330. [PMID: 38630178 DOI: 10.1007/s00464-024-10807-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2024] [Accepted: 03/21/2024] [Indexed: 07/11/2024]
Abstract
BACKGROUND Large language model (LLM)-linked chatbots may be an efficient source of clinical recommendations for healthcare providers and patients. This study evaluated the performance of LLM-linked chatbots in providing recommendations for the surgical management of gastroesophageal reflux disease (GERD). METHODS Nine patient cases were created based on key questions (KQs) addressed by the Society of American Gastrointestinal and Endoscopic Surgeons (SAGES) guidelines for the surgical treatment of GERD. ChatGPT-3.5, ChatGPT-4, Copilot, Google Bard, and Perplexity AI were queried on November 16th, 2023, for recommendations regarding the surgical management of GERD. Accurate chatbot performance was defined as the number of responses aligning with SAGES guideline recommendations. Outcomes were reported with counts and percentages. RESULTS Surgeons were given accurate recommendations for the surgical management of GERD in an adult patient for 5/7 (71.4%) KQs by ChatGPT-4, 3/7 (42.9%) KQs by Copilot, 6/7 (85.7%) KQs by Google Bard, and 3/7 (42.9%) KQs by Perplexity according to the SAGES guidelines. Patients were given accurate recommendations for 3/5 (60.0%) KQs by ChatGPT-4, 2/5 (40.0%) KQs by Copilot, 4/5 (80.0%) KQs by Google Bard, and 1/5 (20.0%) KQs by Perplexity. In a pediatric patient, surgeons were given accurate recommendations for 2/3 (66.7%) KQs by ChatGPT-4, 3/3 (100.0%) KQs by Copilot, 3/3 (100.0%) KQs by Google Bard, and 2/3 (66.7%) KQs by Perplexity. Patients were given appropriate guidance for 2/2 (100.0%) KQs by ChatGPT-4, 2/2 (100.0%) KQs by Copilot, 1/2 (50.0%) KQs by Google Bard, and 1/2 (50.0%) KQs by Perplexity. CONCLUSIONS Gastrointestinal surgeons, gastroenterologists, and patients should recognize both the promise and pitfalls of LLMs when utilized for advice on surgical management of GERD. Additional training of LLMs using evidence-based health information is needed.
Collapse
Affiliation(s)
- Bright Huo
- Division of General Surgery, Department of Surgery, McMaster University, Hamilton, ON, Canada
| | - Elisa Calabrese
- University of California South California, East Bay, Oakland, CA, USA
| | - Patricia Sylla
- Division of Colon and Rectal Surgery, Department of Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Sunjay Kumar
- Department of General Surgery, Thomas Jefferson University Hospital, Philadelphia, PA, USA
| | - Romeo C Ignacio
- Division of Pediatric Surgery, Department of Surgery, University of California San Diego School of Medicine, San Diego, CA, USA
| | - Rodolfo Oviedo
- Nacogdoches Center for Metabolic and Weight Loss Surgery, Nacogdoches, TX, USA
- University of Houston Tilman J. Fertitta Family College of Medicine, Houston, TX, USA
- Sam Houston State University College of Osteopathic Medicine, Conroe, TX, USA
| | | | | | - Andreas Kaiser
- Division of Colorectal Surgery, Department of Surgery, City of Hope National Medical Center, Duarte, CA, USA
| | - Danielle S Walsh
- Department of Surgery, University of Kentucky, Lexington, KY, USA
| | - Wesley Vosburg
- Department of Surgery, Harvard Medical School, Mount Auburn Hospital, Cambridge, MA, USA.
| |
Collapse
|
114
|
Wang L, Wan Z, Ni C, Song Q, Li Y, Clayton EW, Malin BA, Yin Z. A Systematic Review of ChatGPT and Other Conversational Large Language Models in Healthcare. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.04.26.24306390. [PMID: 38712148 PMCID: PMC11071576 DOI: 10.1101/2024.04.26.24306390] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/08/2024]
Abstract
Background The launch of the Chat Generative Pre-trained Transformer (ChatGPT) in November 2022 has attracted public attention and academic interest to large language models (LLMs), facilitating the emergence of many other innovative LLMs. These LLMs have been applied in various fields, including healthcare. Numerous studies have since been conducted regarding how to employ state-of-the-art LLMs in health-related scenarios to assist patients, doctors, and public health administrators. Objective This review aims to summarize the applications and concerns of applying conversational LLMs in healthcare and provide an agenda for future research on LLMs in healthcare. Methods We utilized PubMed, ACM, and IEEE digital libraries as primary sources for this review. We followed the guidance of Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) to screen and select peer-reviewed research articles that (1) were related to both healthcare applications and conversational LLMs and (2) were published before September 1st, 2023, the date when we started paper collection and screening. We investigated these papers and classified them according to their applications and concerns. Results Our search initially identified 820 papers according to targeted keywords, out of which 65 papers met our criteria and were included in the review. The most popular conversational LLM was ChatGPT from OpenAI (60), followed by Bard from Google (1), Large Language Model Meta AI (LLaMA) from Meta (1), and other LLMs (5). These papers were classified into four categories in terms of their applications: 1) summarization, 2) medical knowledge inquiry, 3) prediction, and 4) administration, and four categories of concerns: 1) reliability, 2) bias, 3) privacy, and 4) public acceptability. There are 49 (75%) research papers using LLMs for summarization and/or medical knowledge inquiry, and 58 (89%) research papers expressing concerns about reliability and/or bias.
We found that conversational LLMs exhibit promising results in summarization and in providing medical knowledge to patients with relatively high accuracy. However, conversational LLMs like ChatGPT are not able to provide reliable answers to complex health-related tasks that require specialized domain expertise. Additionally, none of the reviewed papers conducted experiments to thoroughly examine how conversational LLMs lead to bias or privacy issues in healthcare research. Conclusions Future studies should focus on improving the reliability of LLM applications in complex health-related tasks, as well as investigating the mechanisms by which LLM applications introduce bias and privacy issues. Considering the vast accessibility of LLMs, legal, social, and technical efforts are all needed to address concerns about LLMs and to promote, improve, and regulate their application in healthcare.
Collapse
Affiliation(s)
- Leyao Wang
- Department of Computer Science, Vanderbilt University, Nashville, TN, USA, 37212
| | - Zhiyu Wan
- Department of Biomedical Informatics, Vanderbilt University Medical Center, TN, USA, 37203
| | - Congning Ni
- Department of Computer Science, Vanderbilt University, Nashville, TN, USA, 37212
| | - Qingyuan Song
- Department of Computer Science, Vanderbilt University, Nashville, TN, USA, 37212
| | - Yang Li
- Department of Computer Science, Vanderbilt University, Nashville, TN, USA, 37212
| | - Ellen Wright Clayton
- Department of Pediatrics, Vanderbilt University Medical Center, Nashville, Tennessee, USA, 37203
- Center for Biomedical Ethics and Society, Vanderbilt University Medical Center, Nashville, Tennessee, USA, 37203
| | - Bradley A. Malin
- Department of Computer Science, Vanderbilt University, Nashville, TN, USA, 37212
- Department of Biomedical Informatics, Vanderbilt University Medical Center, TN, USA, 37203
- Department of Biostatistics, Vanderbilt University Medical Center, TN, USA, 37203
| | - Zhijun Yin
- Department of Computer Science, Vanderbilt University, Nashville, TN, USA, 37212
- Department of Biomedical Informatics, Vanderbilt University Medical Center, TN, USA, 37203
| |
Collapse
|
115
|
Wu J, Ma Y, Wang J, Xiao M. The Application of ChatGPT in Medicine: A Scoping Review and Bibliometric Analysis. J Multidiscip Healthc 2024; 17:1681-1692. [PMID: 38650670 PMCID: PMC11034560 DOI: 10.2147/jmdh.s463128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2024] [Accepted: 03/25/2024] [Indexed: 04/25/2024] Open
Abstract
Purpose ChatGPT has a wide range of applications in the medical field. Therefore, this review aims to define the key issues and provide a comprehensive view of the literature based on the application of ChatGPT in medicine. Methods This scoping review follows Arksey and O'Malley's five-stage framework. A comprehensive literature search of publications (30 November 2022 to 16 August 2023) was conducted. Six databases were searched and relevant references were systematically catalogued. Attention was focused on the general characteristics of the articles, their fields of application, and the advantages and disadvantages of using ChatGPT. Descriptive statistics and narrative synthesis methods were used for data analysis. Results Of the 3426 studies, 247 met the criteria for inclusion in this review. The majority of articles (31.17%) were from the United States. Editorials (43.32%) ranked first, followed by experimental studies (11.74%). The potential applications of ChatGPT in medicine are varied, with the largest number of studies (45.75%) exploring clinical practice, including assisting with clinical decision support and providing disease information and medical advice. This was followed by medical education (27.13%) and scientific research (16.19%). Among individual disciplines, radiology, surgery, and dentistry were particularly noteworthy at the top of the list. However, the use of ChatGPT in medicine also raises issues of data privacy, inaccuracy, and plagiarism. Conclusion The application of ChatGPT in medicine focuses on different disciplines and general application scenarios. ChatGPT has a paradoxical nature: it offers significant advantages, but at the same time raises great concerns about its application in healthcare settings. Therefore, it is imperative to develop theoretical frameworks that not only address its widespread use in healthcare but also facilitate a comprehensive assessment.
In addition, these frameworks should contribute to the development of strict and effective guidelines and regulatory measures.
Collapse
Affiliation(s)
- Jie Wu
- Department of Nursing, the First Affiliated Hospital of Chongqing Medical University, Chongqing, People’s Republic of China
| | - Yingzhuo Ma
- Department of Nursing, the First Affiliated Hospital of Chongqing Medical University, Chongqing, People’s Republic of China
| | - Jun Wang
- Department of Nursing, the First Affiliated Hospital of Chongqing Medical University, Chongqing, People’s Republic of China
| | - Mingzhao Xiao
- Department of Urology, the First Affiliated Hospital of Chongqing Medical University, Chongqing, People’s Republic of China
| |
Collapse
|
116
|
Wimbarti S, Kairupan BHR, Tallei TE. Critical review of self-diagnosis of mental health conditions using artificial intelligence. Int J Ment Health Nurs 2024; 33:344-358. [PMID: 38345132 DOI: 10.1111/inm.13303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/20/2023] [Revised: 01/26/2024] [Accepted: 01/30/2024] [Indexed: 03/10/2024]
Abstract
The advent of artificial intelligence (AI) has revolutionised various aspects of our lives, including mental health nursing. AI-driven tools and applications have provided a convenient and accessible means for individuals to assess their mental well-being within the confines of their homes. Nonetheless, the widespread trend of self-diagnosing mental health conditions through AI poses considerable risks. This review article examines the perils associated with relying on AI for self-diagnosis in mental health, highlighting the constraints and possible adverse outcomes that can arise from such practices. It delves into the ethical, psychological, and social implications, underscoring the vital role of mental health professionals, including psychologists, psychiatrists, and nursing specialists, in providing professional assistance and guidance. This article aims to highlight the importance of seeking professional assistance and guidance in addressing mental health concerns, especially in the era of AI-driven self-diagnosis.
Collapse
Affiliation(s)
- Supra Wimbarti
- Faculty of Psychology, Universitas Gadjah Mada, Yogyakarta, Indonesia
| | - B H Ralph Kairupan
- Department of Psychiatry, Faculty of Medicine, Sam Ratulangi University, Manado, North Sulawesi, Indonesia
| | - Trina Ekawati Tallei
- Department of Biology, Faculty of Mathematics and Natural Sciences, Sam Ratulangi University, Manado, North Sulawesi, Indonesia
- Department of Biology, Faculty of Medicine, Sam Ratulangi University, Manado, North Sulawesi, Indonesia
| |
Collapse
|
117
|
Ge J, Li M, Delk MB, Lai JC. A Comparison of a Large Language Model vs Manual Chart Review for the Extraction of Data Elements From the Electronic Health Record. Gastroenterology 2024; 166:707-709.e3. [PMID: 38151192 PMCID: PMC11792087 DOI: 10.1053/j.gastro.2023.12.019] [Citation(s) in RCA: 25] [Impact Index Per Article: 25.0] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/12/2023] [Revised: 12/10/2023] [Accepted: 12/18/2023] [Indexed: 12/29/2023]
Affiliation(s)
- Jin Ge
- Division of Gastroenterology and Hepatology, Department of Medicine, University of California, San Francisco, San Francisco, California.
| | - Michael Li
- Division of Gastroenterology and Hepatology, Department of Medicine, University of California, San Francisco, San Francisco, California
| | - Molly B Delk
- Section of Gastroenterology and Hepatology, Department of Medicine, Tulane University School of Medicine, New Orleans, Louisiana
| | - Jennifer C Lai
- Division of Gastroenterology and Hepatology, Department of Medicine, University of California, San Francisco, San Francisco, California
| |
Collapse
|
118
|
Ahimaz P, Bergner AL, Florido ME, Harkavy N, Bhattacharyya S. Genetic counselors' utilization of ChatGPT in professional practice: A cross-sectional study. Am J Med Genet A 2024; 194:e63493. [PMID: 38066714 DOI: 10.1002/ajmg.a.63493] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2023] [Revised: 11/21/2023] [Accepted: 11/22/2023] [Indexed: 03/10/2024]
Abstract
PURPOSE The precision medicine era has seen increased utilization of artificial intelligence (AI) in the field of genetics. We sought to explore the ways that genetic counselors (GCs) currently use the publicly accessible AI tool Chat Generative Pre-trained Transformer (ChatGPT) in their work. METHODS GCs in North America were surveyed about how ChatGPT is used in different aspects of their work. Descriptive statistics were reported through frequencies and means. RESULTS Of 118 GCs who completed the survey, 33.8% (40) reported using ChatGPT in their work; 47.5% (19) use it in clinical practice, 35% (14) use it in education, and 32.5% (13) use it in research. Most GCs (62.7%; 74) felt that it saves time on administrative tasks but the majority (82.2%; 97) felt that a paramount challenge was the risk of obtaining incorrect information. The majority of GCs not using ChatGPT (58.9%; 46) felt it was not necessary for their work. CONCLUSION A considerable number of GCs in the field are using ChatGPT in different ways, but it is primarily helpful with tasks that involve writing. It has potential to streamline workflow issues encountered in clinical genetics, but practitioners need to be informed and uniformly trained about its limitations.
Collapse
Affiliation(s)
- Priyanka Ahimaz
- Genetic Counseling Graduate Program, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Department of Pediatrics, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
| | - Amanda L Bergner
- Genetic Counseling Graduate Program, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Department of Genetics and Development, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Department of Neurology, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
| | - Michelle E Florido
- Genetic Counseling Graduate Program, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Department of Genetics and Development, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
| | - Nina Harkavy
- Genetic Counseling Graduate Program, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Department of Obstetrics and Gynecology, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
| | - Sriya Bhattacharyya
- Genetic Counseling Graduate Program, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
- Department of Psychiatry, Vagelos College of Physicians and Surgeons, Columbia University, New York, New York, USA
| |
Collapse
|
119
|
Liu XQ, Zhang ZR. Potential use of large language models for mitigating students' problematic social media use: ChatGPT as an example. World J Psychiatry 2024; 14:334-341. [PMID: 38617990 PMCID: PMC11008388 DOI: 10.5498/wjp.v14.i3.334] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/05/2023] [Revised: 01/15/2024] [Accepted: 02/05/2024] [Indexed: 03/19/2024] Open
Abstract
The problematic use of social media has numerous negative impacts on individuals' daily lives, interpersonal relationships, physical and mental health, and more. Currently, there are few methods and tools to alleviate problematic social media use, and their potential is yet to be fully realized. Emerging large language models (LLMs) are becoming increasingly popular for providing information and assistance to people and are being applied in many aspects of life. In mitigating problematic social media use, LLMs such as ChatGPT can play a positive role by serving as conversational partners and outlets for users, providing personalized information and resources, monitoring and intervening in problematic social media use, and more. In this process, we should recognize both the enormous potential and endless possibilities of LLMs such as ChatGPT, leveraging their advantages to better address problematic social media use, while also acknowledging the limitations and potential pitfalls of ChatGPT technology, such as errors, limitations in issue resolution, privacy and security concerns, and potential overreliance. When we leverage the advantages of LLMs to address issues in social media usage, we must adopt a cautious and ethical approach and remain vigilant about the potential adverse effects LLMs may have in this role, so that technology better serves individuals and society.
Collapse
Affiliation(s)
- Xin-Qiao Liu
- School of Education, Tianjin University, Tianjin 300350, China
| | - Zi-Ru Zhang
- School of Education, Tianjin University, Tianjin 300350, China
| |
Collapse
|
120
|
Kaufmann B, Busby D, Das CK, Tillu N, Menon M, Tewari AK, Gorin MA. Validation of a Zero-shot Learning Natural Language Processing Tool to Facilitate Data Abstraction for Urologic Research. Eur Urol Focus 2024; 10:279-287. [PMID: 38278710 DOI: 10.1016/j.euf.2024.01.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/24/2023] [Revised: 12/18/2023] [Accepted: 01/15/2024] [Indexed: 01/28/2024]
Abstract
BACKGROUND Urologic research often requires data abstraction from unstructured text contained within the electronic health record. A number of natural language processing (NLP) tools have been developed to aid with this time-consuming task; however, the generalizability of these tools is typically limited by the need for task-specific training. OBJECTIVE To describe the development and validation of a zero-shot learning NLP tool to facilitate data abstraction from unstructured text for use in downstream urologic research. DESIGN, SETTING, AND PARTICIPANTS An NLP tool based on the GPT-3.5 model from OpenAI was developed and compared with three physicians for time to task completion and accuracy for abstracting 14 unique variables from a set of 199 deidentified radical prostatectomy pathology reports. The reports were processed in vectorized and scanned formats to establish the impact of optical character recognition on data abstraction. INTERVENTION A zero-shot learning NLP tool for data abstraction. OUTCOME MEASUREMENTS AND STATISTICAL ANALYSIS The tool was compared with the human abstractors in terms of superiority for data abstraction speed and noninferiority for accuracy. RESULTS AND LIMITATIONS The human abstractors required a median (interquartile range) of 93 s (72-122 s) per report for data abstraction, whereas the software required a median of 12 s (10-15 s) for the vectorized reports and 15 s (13-17 s) for the scanned reports (p < 0.001 for all paired comparisons). The accuracies of the three human abstractors were 94.7% (95% confidence interval [CI], 93.8-95.5%), 97.8% (95% CI, 97.2-98.3%), and 96.4% (95% CI, 95.6-97%) for the combined set of 2786 data points. The tool had an accuracy of 94.2% (95% CI, 93.3-94.9%) for the vectorized reports and was noninferior to the human abstractors at a margin of -10% (α = 0.025).
The tool had a slightly lower accuracy of 88.7% (95% CI, 87.5-89.9%) for the scanned reports, making it noninferior to two of three human abstractors. CONCLUSIONS The developed zero-shot learning NLP tool offers urologic researchers a highly generalizable and accurate method for data abstraction from unstructured text. An open access version of the tool is available for immediate use by the urologic community. PATIENT SUMMARY In this report, we describe the design and validation of an artificial intelligence tool for abstracting discrete data from unstructured notes contained within the electronic medical record. This freely available tool, which is based on the GPT-3.5 technology from OpenAI, is intended to facilitate research and scientific discovery by the urologic community.
Collapse
Affiliation(s)
- Basil Kaufmann
- Milton and Carroll Petrie Department of Urology, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Urology, University Hospital Zurich, University of Zurich, Zurich, Switzerland.
| | - Dallin Busby
- Milton and Carroll Petrie Department of Urology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Chandan Krushna Das
- Milton and Carroll Petrie Department of Urology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Neeraja Tillu
- Milton and Carroll Petrie Department of Urology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Mani Menon
- Milton and Carroll Petrie Department of Urology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Ashutosh K Tewari
- Milton and Carroll Petrie Department of Urology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Michael A Gorin
- Milton and Carroll Petrie Department of Urology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| |
Collapse
|
121
|
Hossain MM. Using ChatGPT and other forms of generative AI in systematic reviews: Challenges and opportunities. J Med Imaging Radiat Sci 2024; 55:11-12. [PMID: 38040497 DOI: 10.1016/j.jmir.2023.11.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2023] [Accepted: 11/20/2023] [Indexed: 12/03/2023]
Affiliation(s)
- M Mahbub Hossain
- Department of Decision and Information Sciences, C.T. Bauer College of Business, University of Houston, TX 77204, USA; Department of Health Systems and Population Health Sciences, Tilman J. Fertitta Family College of Medicine, University of Houston, TX 77204, USA.
| |
Collapse
|
122
|
Çiftci N, Sarman A, Yıldız M, Çiftci K. Use of ChatGPT in health: benefits, hazards, and recommendations. Public Health 2024; 228:e1-e2. [PMID: 38346914 DOI: 10.1016/j.puhe.2023.12.032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2023] [Revised: 12/11/2023] [Accepted: 12/28/2023] [Indexed: 03/16/2024]
Affiliation(s)
- N Çiftci
- Department of Nursing, Faculty of Health Sciences, Muş Alparslan University, Muş, Turkey
| | - A Sarman
- Department of Pediatric Nursing, Faculty of Health Science, Bingöl University, Bingöl, Turkey.
| | - M Yıldız
- Department of Midwifery, Faculty of Health Science, Sakarya University, Sakarya, Turkey
| | - K Çiftci
- Department of Medical Services and Techniques, Vocational School of Health Services, Muş Alparslan University, Muş, Turkey
| |
Collapse
|
123
|
Tao BK, Handzic A, Hua NJ, Vosoughi AR, Margolin EA, Micieli JA. Utility of ChatGPT for Automated Creation of Patient Education Handouts: An Application in Neuro-Ophthalmology. J Neuroophthalmol 2024; 44:119-124. [PMID: 38175720 DOI: 10.1097/wno.0000000000002074] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/05/2024]
Abstract
BACKGROUND Patient education in ophthalmology poses a challenge for physicians because of time and resource limitations. ChatGPT (OpenAI, San Francisco) may assist with automating production of patient handouts on common neuro-ophthalmic diseases. METHODS We queried ChatGPT-3.5 to generate 51 patient education handouts across 17 conditions. We devised the "Quality of Generated Language Outputs for Patients" (QGLOP) tool to assess handouts on the domains of accuracy/comprehensiveness, bias, currency, and tone, each scored out of 4 for a total of 16. A fellowship-trained neuro-ophthalmologist scored each passage. Handout readability was assessed using the Simple Measure of Gobbledygook (SMOG), which estimates the years of education required to understand a text. RESULTS The QGLOP scores for accuracy, bias, currency, and tone were 2.43, 3, 3.43, and 3.02, respectively. The mean QGLOP score was 11.9 [95% CI 8.98, 14.8] out of 16 points, indicating a performance of 74.4% [95% CI 56.1%, 92.5%]. The mean SMOG across responses was 10.9 [95% CI 9.36, 12.4] years of education. CONCLUSIONS The mean QGLOP score suggests that a fellowship-trained ophthalmologist may have at least a moderate level of satisfaction with the write-up quality conferred by ChatGPT, though a final review and editing are still required before dissemination. By comparison, the roughly 5% of responses at either extreme would require either very mild or extensive revision. Also, the mean SMOG score exceeded the accepted upper limit of a grade 8 reading level for health-related patient handouts. In its current iteration, ChatGPT should be used as an efficiency tool to generate an initial draft for the neuro-ophthalmologist, who may then refine the accuracy and readability for a lay readership.
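The SMOG index used above has a standard published formula: grade = 1.0430 × √(polysyllables × 30 / sentences) + 3.1291. A minimal sketch is below; it is not the study's code, and the vowel-group syllable counter is a rough heuristic (validated SMOG scoring uses a 30-sentence sample and more careful syllabification).

```python
import math
import re

def count_syllables(word: str) -> int:
    """Rough syllable count: number of vowel groups (heuristic only)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def smog_grade(text: str) -> float:
    """SMOG grade: estimated years of education needed to understand the text.
    Formula: 1.0430 * sqrt(polysyllables * 30 / sentences) + 3.1291."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    # A polysyllable is a word of three or more syllables.
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    return 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291
```

A text dominated by three-plus-syllable words will score well above the grade 8 ceiling cited in the abstract.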
Collapse
Affiliation(s)
- Brendan K Tao
- Faculty of Medicine (BKT), The University of British Columbia, Vancouver, Canada ; Department of Ophthalmology & Vision Science (AH, EAM, JAM), University of Toronto, Toronto, Canada; Temerty Faculty of Medicine (NJH), University of Toronto, Toronto, Canada; Department of Ophthalmology (ARV), Max Rady College of Medicine, University of Manitoba, Winnipeg, Canada; Mount Sinai Hospital (EAM), Toronto, Canada; Division of Neurology (EAM, JAM), Department of Medicine, University of Toronto, Toronto, Canada; Toronto Western Hospital (EAM, JAM), Toronto, Canada; University Health Network (EAM, JAM), Toronto, Canada; Kensington Vision and Research Center (JAM), Toronto, Canada; and St. Michael's Hospital (JAM), Toronto, Canada
| |
Collapse
|
124
|
Tayebi Arasteh S, Han T, Lotfinia M, Kuhl C, Kather JN, Truhn D, Nebelung S. Large language models streamline automated machine learning for clinical studies. Nat Commun 2024; 15:1603. [PMID: 38383555 PMCID: PMC10881983 DOI: 10.1038/s41467-024-45879-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/10/2023] [Accepted: 02/06/2024] [Indexed: 02/23/2024] Open
Abstract
A knowledge gap persists between machine learning (ML) developers (e.g., data scientists) and practitioners (e.g., clinicians), hampering the full utilization of ML for clinical data analysis. We investigated the potential of the ChatGPT Advanced Data Analysis (ADA), an extension of GPT-4, to bridge this gap and perform ML analyses efficiently. Real-world clinical datasets and study details from large trials across various medical specialties were presented to ChatGPT ADA without specific guidance. ChatGPT ADA autonomously developed state-of-the-art ML models based on the original study's training data to predict clinical outcomes such as cancer development, cancer progression, disease complications, or biomarkers such as pathogenic gene sequences. Following the re-implementation and optimization of the published models, the head-to-head comparison of the ChatGPT ADA-crafted ML models and their respective manually crafted counterparts revealed no significant differences in traditional performance metrics (p ≥ 0.072). Strikingly, the ChatGPT ADA-crafted ML models often outperformed their counterparts. In conclusion, ChatGPT ADA offers a promising avenue to democratize ML in medicine by simplifying complex data analyses, yet should enhance, not replace, specialized training and resources, to promote broader applications in medical research and practice.
Collapse
Affiliation(s)
- Soroosh Tayebi Arasteh
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany.
| | - Tianyu Han
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany.
| | - Mahshad Lotfinia
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Institute of Heat and Mass Transfer, RWTH Aachen University, Aachen, Germany
| | - Christiane Kuhl
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
| | - Jakob Nikolas Kather
- Else Kroener Fresenius Center for Digital Health, Medical Faculty Carl Gustav Carus, Technical University Dresden, Dresden, Germany
- Medical Oncology, National Center for Tumor Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany
| | - Daniel Truhn
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
| | - Sven Nebelung
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
| |
Collapse
|
125
|
Zhang YF, Liu XQ. Using ChatGPT to promote college students' participation in physical activities and its effect on mental health. World J Psychiatry 2024; 14:330-333. [PMID: 38464770 PMCID: PMC10921293 DOI: 10.5498/wjp.v14.i2.330] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/12/2023] [Revised: 12/19/2023] [Accepted: 01/23/2024] [Indexed: 02/04/2024] Open
Abstract
As one of the most famous large language models, ChatGPT has great potential for application in physical education. It can provide personalized exercise plans, a variety of exercise options, and interactive support. The integration of ChatGPT into the teaching process can promote college students' participation in physical activities and improve their mental health while expanding the traditional teaching environment and promoting the reform of traditional teaching methods. However, the application of ChatGPT faces challenges and obstacles in physical education. To make full use of ChatGPT in physical education, it can be combined with wearable devices and sports equipment to enhance the efficiency of interactions with users. Relevant policies are urgently needed to avoid the improper use of users' data.
Collapse
Affiliation(s)
- Yi-Fan Zhang
- School of Education, Tianjin University, Tianjin 300350, China
| | - Xin-Qiao Liu
- School of Education, Tianjin University, Tianjin 300350, China
| |
Collapse
|
126
|
Sallam M, Barakat M, Sallam M. A Preliminary Checklist (METRICS) to Standardize the Design and Reporting of Studies on Generative Artificial Intelligence-Based Models in Health Care Education and Practice: Development Study Involving a Literature Review. Interact J Med Res 2024; 13:e54704. [PMID: 38276872 PMCID: PMC10905357 DOI: 10.2196/54704] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/19/2023] [Revised: 12/18/2023] [Accepted: 01/26/2024] [Indexed: 01/27/2024] Open
Abstract
BACKGROUND Adherence to evidence-based practice is indispensable in health care. Recently, the utility of generative artificial intelligence (AI) models in health care has been evaluated extensively. However, the lack of consensus guidelines on the design and reporting of findings of these studies poses a challenge for the interpretation and synthesis of evidence. OBJECTIVE This study aimed to develop a preliminary checklist to standardize the reporting of generative AI-based studies in health care education and practice. METHODS A literature review was conducted in Scopus, PubMed, and Google Scholar. Published records with "ChatGPT," "Bing," or "Bard" in the title were retrieved. Careful examination of the methodologies employed in the included records was conducted to identify the common pertinent themes and the possible gaps in reporting. A panel discussion was held to establish a unified and thorough checklist for the reporting of AI studies in health care. The finalized checklist was used to evaluate the included records by 2 independent raters. Cohen κ was used as the method to evaluate the interrater reliability. RESULTS The final data set that formed the basis for pertinent theme identification and analysis comprised a total of 34 records. The finalized checklist included 9 pertinent themes collectively referred to as METRICS (Model, Evaluation, Timing/Transparency, Range/Randomization, Individual factors, Count, and Specificity of prompts and language). Their details are as follows: (1) Model used and its exact settings; (2) Evaluation approach for the generated content; (3) Timing of testing the model; (4) Transparency of the data source; (5) Range of tested topics; (6) Randomization of selecting the queries; (7) Individual factors in selecting the queries and interrater reliability; (8) Count of queries executed to test the model; and (9) Specificity of the prompts and language used. The overall mean METRICS score was 3.0 (SD 0.58).
Interrater reliability was acceptable, with Cohen κ ranging from 0.558 to 0.962 (P<.001 for all 9 tested items). Per item, the highest average METRICS score was recorded for the "Model" item, followed by the "Specificity" item, while the lowest scores were recorded for the "Randomization" item (classified as suboptimal) and the "Individual factors" item (classified as satisfactory). CONCLUSIONS The METRICS checklist can facilitate the design of studies guiding researchers toward best practices in reporting results. The findings highlight the need for standardized reporting algorithms for generative AI-based studies in health care, considering the variability observed in methodologies and reporting. The proposed METRICS checklist could be a preliminary helpful base to establish a universally accepted approach to standardize the design and reporting of generative AI-based studies in health care, which is a swiftly evolving research topic.
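For readers unfamiliar with the interrater statistic reported here: Cohen's κ compares the observed agreement between two raters against the agreement expected by chance from each rater's label frequencies. A minimal pure-Python sketch (illustrative only, not the authors' analysis code):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical labels on the same items.
    kappa = (p_observed - p_expected) / (1 - p_expected)."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from the raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)
```

Values near 1 indicate near-perfect agreement; the 0.558–0.962 range reported above spans moderate to almost-perfect agreement on conventional interpretation scales.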
Collapse
Affiliation(s)
- Malik Sallam
- Department of Pathology, Microbiology and Forensic Medicine, School of Medicine, The University of Jordan, Amman, Jordan
- Department of Clinical Laboratories and Forensic Medicine, Jordan University Hospital, Amman, Jordan
- Department of Translational Medicine, Faculty of Medicine, Lund University, Malmo, Sweden
| | - Muna Barakat
- Department of Clinical Pharmacy and Therapeutics, Faculty of Pharmacy, Applied Science Private University, Amman, Jordan
| | - Mohammed Sallam
- Department of Pharmacy, Mediclinic Parkview Hospital, Mediclinic Middle East, Dubai, United Arab Emirates
| |
Collapse
|
127
|
Weidener L, Fischer M. Proposing a Principle-Based Approach for Teaching AI Ethics in Medical Education. JMIR MEDICAL EDUCATION 2024; 10:e55368. [PMID: 38285931 PMCID: PMC10891487 DOI: 10.2196/55368] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/11/2023] [Revised: 01/02/2024] [Accepted: 01/29/2024] [Indexed: 01/31/2024]
Abstract
The use of artificial intelligence (AI) in medicine, potentially leading to substantial advancements such as improved diagnostics, has been of increasing scientific and societal interest in recent years. However, the use of AI raises new ethical challenges, such as an increased risk of bias and potential discrimination against patients, as well as misdiagnoses potentially leading to over- or underdiagnosis with substantial consequences for patients. Recognizing these challenges, current research underscores the importance of integrating AI ethics into medical education. This viewpoint paper aims to introduce a comprehensive set of ethical principles for teaching AI ethics in medical education. This dynamic and principle-based approach is designed to be adaptive and comprehensive, addressing not only the current but also emerging ethical challenges associated with the use of AI in medicine. This study conducts a theoretical analysis of the current academic discourse on AI ethics in medical education, identifying potential gaps and limitations. The inherent interconnectivity and interdisciplinary nature of these anticipated challenges are illustrated through a focused discussion on "informed consent" in the context of AI in medicine and medical education. This paper proposes a principle-based approach to AI ethics education, building on the 4 principles of medical ethics-autonomy, beneficence, nonmaleficence, and justice-and extending them by integrating 3 public health ethics principles-efficiency, common good orientation, and proportionality. The principle-based approach to teaching AI ethics in medical education proposed in this study offers a foundational framework for addressing the anticipated ethical challenges of using AI in medicine, recommended in the current academic discourse. 
By incorporating the 3 principles of public health ethics, this principle-based approach ensures that medical ethics education remains relevant and responsive to the dynamic landscape of AI integration in medicine. As the advancement of AI technologies in medicine is expected to increase, medical ethics education must adapt and evolve accordingly. The proposed principle-based approach for teaching AI ethics in medical education provides an important foundation to ensure that future medical professionals are not only aware of the ethical dimensions of AI in medicine but also equipped to make informed ethical decisions in their practice. Future research is required to develop problem-based and competency-oriented learning objectives and educational content for the proposed principle-based approach to teaching AI ethics in medical education.
Collapse
Affiliation(s)
- Lukas Weidener
- UMIT TIROL - Private University for Health Sciences and Health Technology, Hall in Tirol, Austria
| | - Michael Fischer
- UMIT TIROL - Private University for Health Sciences and Health Technology, Hall in Tirol, Austria
| |
Collapse
|
128
|
Jing X, Cimino JJ, Patel VL, Zhou Y, Shubrook JH, Liu C, De Lacalle S. Data-Driven Hypothesis Generation in Clinical Research: What We Learned from a Human Subject Study? MEDICAL RESEARCH ARCHIVES 2024; 12:10.18103/mra.v12i2.5132. [PMID: 39211055 PMCID: PMC11361316 DOI: 10.18103/mra.v12i2.5132] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Indexed: 09/04/2024]
Abstract
Hypothesis generation is an early and critical step in any hypothesis-driven clinical research project. Because it is not yet a well-understood cognitive process, the need to improve the process goes unrecognized. Without an impactful hypothesis, the significance of any research project can be questionable, regardless of the rigor or diligence applied in other steps of the study, e.g., study design, data collection, and result analysis. In this perspective article, the authors provide a literature review on the following topics first: scientific thinking, reasoning, medical reasoning, literature-based discovery, and a field study to explore scientific thinking and discovery. Over the years, scientific thinking has shown excellent progress in cognitive science and its applied areas: education, medicine, and biomedical research. However, a review of the literature reveals the lack of original studies on hypothesis generation in clinical research. The authors then summarize their first human participant study exploring data-driven hypothesis generation by clinical researchers in a simulated setting. The results indicate that a secondary data analytical tool, VIADS-a visual interactive analytic tool for filtering, summarizing, and visualizing large health data sets coded with hierarchical terminologies, can shorten the time participants need, on average, to generate a hypothesis and also requires fewer cognitive events to generate each hypothesis. As a counterpoint, this exploration also indicates that the quality ratings of the hypotheses thus generated carry significantly lower ratings for feasibility when applying VIADS. Despite its small scale, the study confirmed the feasibility of conducting a human participant study directly to explore the hypothesis generation process in clinical research. 
This study provides supporting evidence to conduct a larger-scale study with a specifically designed tool to facilitate the hypothesis-generation process among inexperienced clinical researchers. A larger study could provide generalizable evidence, which in turn can potentially improve clinical research productivity and overall clinical research enterprise.
Collapse
Affiliation(s)
- Xia Jing
- Department of Public Health Sciences, College of Behavioral, Social and Health Sciences, Clemson University, Clemson, SC
| | - James J. Cimino
- Informatics Institute, School of Medicine, University of Alabama, Birmingham, Birmingham, AL
| | - Vimla L. Patel
- Cognitive Studies in Medicine and Public Health, The New York Academy of Medicine, New York City, NY
| | - Yuchun Zhou
- Department of Educational Studies, Patton College of Education, Ohio University, Athens, OH
| | - Jay H. Shubrook
- Department of Clinical Sciences and Community Health, Touro University California College of Osteopathic Medicine, Vallejo, CA
| | - Chang Liu
- Department of Electrical Engineering and Computer Science, Russ College of Engineering and Technology, Ohio University, Athens, OH
| | - Sonsoles De Lacalle
- Department of Health Science, California State University Channel Islands, Camarillo, CA
| |
Collapse
|
129
|
Khan MS, Umer H. ChatGPT in finance: Applications, challenges, and solutions. Heliyon 2024; 10:e24890. [PMID: 38304767 PMCID: PMC10831748 DOI: 10.1016/j.heliyon.2024.e24890] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2023] [Revised: 12/13/2023] [Accepted: 01/16/2024] [Indexed: 02/03/2024] Open
Abstract
The emergence of ChatGPT, a generative artificial intelligence tool, has sparked a revolution in the finance industry, enabling individuals to interact with technology in natural language. However, the use of ChatGPT in finance presents a profound array of ethical considerations that demand careful scrutiny to ensure its responsible and ethical use. After a concise exploration of ChatGPT's applications in finance, this policy article delves into the ethical challenges arising from the use of ChatGPT in finance, including outcomes contaminated with biases, incorporation of fake information in the financial decisions, concerns surrounding privacy and security, lack of transparency and accountability in the decision-making processes and financial services, human job displacement, and the intricate web of legal complexities. Our article asserts that financial institutions employing ChatGPT must proactively devise strategies to confront these burgeoning challenges, mitigating their adverse effects on both individuals and society as a whole. Additionally, we propose relevant policies to tackle these ethical quandaries head-on. In essence, this article illuminates the imperative need for a meticulous ethical framework, facilitating an informed and responsible use of ChatGPT in the realm of finance, safeguarding the welfare of individuals and society. While our work significantly contributes to the research and practice of finance, we also identify future research avenues.
Collapse
Affiliation(s)
| | - Hamza Umer
- Hitotsubashi Institute for Advanced Study (HIAS), Institute of Economic Research (IER), Hitotsubashi University, Japan
| |
Collapse
|
130
|
Andrew A. Potential applications and implications of large language models in primary care. Fam Med Community Health 2024; 12:e002602. [PMID: 38290759 PMCID: PMC10828839 DOI: 10.1136/fmch-2023-002602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/26/2023] [Accepted: 01/16/2024] [Indexed: 02/01/2024] Open
Abstract
The recent release of highly advanced generative artificial intelligence (AI) chatbots, including ChatGPT and Bard, which are powered by large language models (LLMs), has attracted growing mainstream interest in their diverse applications in clinical practice, including health and healthcare. The potential applications of LLM-based programmes in the medical field range from assisting medical practitioners in improving their clinical decision-making and streamlining administrative paperwork to empowering patients to take charge of their own health. However, despite the broad range of benefits, the use of such AI tools also comes with several limitations and ethical concerns that warrant further consideration, encompassing issues related to privacy, data bias, and the accuracy and reliability of information generated by AI. The focus of prior research has primarily centred on the broad applications of LLMs in medicine. To the author's knowledge, this is the first article that consolidates current and pertinent literature on LLMs to examine their potential in primary care. The objectives of this paper are not only to summarise the potential benefits, risks and challenges of using LLMs in primary care, but also to offer insights into considerations that primary care clinicians should take into account when deciding to adopt and integrate such technologies into their clinical practice.
Collapse
Affiliation(s)
- Albert Andrew
- Medical Student, The University of Auckland School of Medicine, Auckland, New Zealand
| |
Collapse
|
131
|
Mediboina A, Badam RK, Chodavarapu S. Assessing the Accuracy of Information on Medication Abortion: A Comparative Analysis of ChatGPT and Google Bard AI. Cureus 2024; 16:e51544. [PMID: 38318564 PMCID: PMC10840059 DOI: 10.7759/cureus.51544] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 01/01/2024] [Indexed: 02/07/2024] Open
Abstract
Background and objective ChatGPT and Google Bard AI are widely used conversational chatbots, even in healthcare. While they have several strengths, they can generate seemingly correct but erroneous responses, warranting caution in medical contexts. In an era where access to abortion care is diminishing, patients may increasingly rely on online resources and AI-driven language models for information on medication abortions. In light of this, this study aimed to compare the accuracy and comprehensiveness of responses generated by ChatGPT 3.5 and Google Bard AI to medical queries about medication abortions. Methods Fourteen open-ended questions about medication abortion were formulated based on the Frequently Asked Questions (FAQs) from the National Abortion Federation (NAF) and the Reproductive Health Access Project (RHAP) websites. These questions were answered using ChatGPT version 3.5 and Google Bard AI on October 7, 2023. The accuracy of the responses was analyzed by cross-referencing the generated answers against the information provided by NAF and RHAP. Any discrepancies were further verified against the guidelines from the American Congress of Obstetricians and Gynecologists (ACOG). A rating scale used by Johnson et al. was employed for assessment, utilizing a 6-point Likert scale [ranging from 1 (completely incorrect) to 6 (correct)] to evaluate accuracy and a 3-point scale [ranging from 1 (incomplete) to 3 (comprehensive)] to assess completeness. Questions that did not yield answers were assigned a score of 0 and omitted from the correlation analysis. Data analysis and visualization were done using R Software version 4.3.1. Statistical significance was determined by employing Spearman's R and Mann-Whitney U tests. Results All questions were entered sequentially into both chatbots by the same author. On the initial attempt, ChatGPT successfully generated relevant responses for all questions, while Google Bard AI failed to provide answers for five questions. 
Repeating the same question in Google Bard AI yielded an answer for one; two were answered with different phrasing; and two remained unanswered despite rephrasing. ChatGPT showed a median accuracy score of 5 (mean: 5.26, SD: 0.73) and a median completeness score of 3 (mean: 2.57, SD: 0.51). It showed the highest accuracy score in six responses and the highest completeness score in eight responses. In contrast, Google Bard AI had a median accuracy score of 5 (mean: 4.5, SD: 2.03) and a median completeness score of 2 (mean: 2.14, SD: 1.03). It achieved the highest accuracy score in five responses and the highest completeness score in six responses. Spearman's correlation coefficient revealed no correlation between accuracy and completeness for ChatGPT (rs = -0.46771, p = 0.09171). However, Google Bard AI showed a marginally significant correlation (rs = 0.5738, p = 0.05108). Mann-Whitney U test indicated no statistically significant differences between ChatGPT and Google Bard AI concerning accuracy (U = 82, p>0.05) or completeness (U = 78, p>0.05). Conclusion While both chatbots showed similar levels of accuracy, minor errors were noted, pertaining to finer aspects that demand specialized knowledge of abortion care. This could explain the lack of a significant correlation between accuracy and completeness. Ultimately, AI-driven language models have the potential to provide information on medication abortions, but there is a need for continual refinement and oversight.
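Both tests reported above are standard nonparametric statistics. As an illustration of the first (not the study's analysis code, which used R), Spearman's ρ is simply the Pearson correlation computed on the ranks of the data, with ties assigned their average rank:

```python
def rankdata(values):
    """Ranks (1-based), with ties assigned the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over any run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    rx, ry = rankdata(x), rankdata(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0
```

Because it works on ranks, ρ captures any monotonic association, which suits ordinal Likert-type accuracy and completeness scores like those used in the study.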
Collapse
Affiliation(s)
- Anjali Mediboina
- Community Medicine, Alluri Sita Ramaraju Academy of Medical Sciences, Eluru, IND
| | - Rajani Kumari Badam
- Obstetrics and Gynaecology, Sri Venkateswara Medical College, Tirupathi, IND
| | - Sailaja Chodavarapu
- Obstetrics and Gynaecology, Government Medical College, Rajamahendravaram, IND
| |
Collapse
|
132
|
Zhang JS, Yoon C, Williams DKA, Pinkas A. Exploring the Usage of ChatGPT Among Medical Students in the United States. JOURNAL OF MEDICAL EDUCATION AND CURRICULAR DEVELOPMENT 2024; 11:23821205241264695. [PMID: 39092290 PMCID: PMC11292693 DOI: 10.1177/23821205241264695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/07/2024] [Accepted: 06/08/2024] [Indexed: 08/04/2024]
Abstract
OBJECTIVES Chat Generative Pretrained Transformer (ChatGPT) is a large language model developed by OpenAI that has gained widespread interest. It has been cited for its potential impact on health care and its beneficial role in medical education. However, there is limited investigation into its use among medical students. In this study, we evaluated the frequency of ChatGPT use, motivations for use, and preference for ChatGPT over existing resources among medical students in the United States. METHODS Data were collected from an original survey consisting of 14 questions assessing the frequency and usage of ChatGPT in various contexts within medical education. The survey was distributed via email lists, group messaging applications, and classroom lectures to medical students across the United States. Responses were collected between August and October 2023. RESULTS One hundred thirty-one participants completed the survey and were included in the analysis. Of the total, 48.9% of respondents reported having used ChatGPT in their medical studies. Among ChatGPT users, 43.7% report using ChatGPT weekly, several times per week, or daily. ChatGPT is most used for writing, revising, editing, and summarizing purposes: 37.5% and 41.3% of respondents reported using ChatGPT for more than 25% of the working time spent on these tasks, respectively. Among respondents who have not used ChatGPT, more than 50% reported being extremely unlikely or unlikely to use ChatGPT across all surveyed scenarios. ChatGPT users report they are more likely to use ChatGPT over directly asking professors or attendings (45.3%), textbooks (42.2%), and lectures (31.7%), and least likely to use it over the popular flashcard application Anki (11.1%) and medical education videos (9.5%). CONCLUSIONS ChatGPT is an increasingly popular resource among medical students, with many preferring ChatGPT over other traditional resources such as professors, textbooks, and lectures.
Its impact on medical education will only continue to grow as its capabilities improve.
Collapse
Affiliation(s)
| | - Christine Yoon
- Albert Einstein College of Medicine, Bronx, New York, USA
| | | | - Adi Pinkas
- Albert Einstein College of Medicine, Bronx, New York, USA
| |
Collapse
|
133
|
Abdulnazar A, Roller R, Schulz S, Kreuzthaler M. Unsupervised SapBERT-based bi-encoders for medical concept annotation of clinical narratives with SNOMED CT. Digit Health 2024; 10:20552076241288681. [PMID: 39493636 PMCID: PMC11531008 DOI: 10.1177/20552076241288681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2024] [Accepted: 09/03/2024] [Indexed: 11/05/2024] Open
Abstract
Objective Clinical narratives provide comprehensive patient information. Achieving interoperability involves mapping relevant details to standardized medical vocabularies. Typically, natural language processing divides this task into named entity recognition (NER) and medical concept normalization (MCN). State-of-the-art results require supervised setups with abundant training data. However, the limited availability of annotated data due to sensitivity and time constraints poses challenges. This study addressed the need for unsupervised medical concept annotation (MCA) to overcome these limitations and support the creation of annotated datasets. Method We use an unsupervised SapBERT-based bi-encoder model to analyze n-grams from narrative text and measure their similarity to SNOMED CT concepts. At the end, we apply a syntactical re-ranker. For evaluation, we use the semantic tags of SNOMED CT candidates to assess the NER phase and their concept IDs to assess the MCN phase. The approach is evaluated with both English and German narratives. Result Without training data, our unsupervised approach achieves an F1 score of 0.765 in English and 0.557 in German for MCN. Evaluation at the semantic tag level reveals that "disorder" has the highest F1 scores, 0.871 and 0.648 on English and German datasets. Furthermore, the MCA approach on the semantic tag "disorder" shows F1 scores of 0.839 and 0.696 in English and 0.685 and 0.437 in German for NER and MCN, respectively. Conclusion This unsupervised approach demonstrates potential for initial annotation (pre-labeling) in manual annotation tasks. While promising for certain semantic tags, challenges remain, including false positives, contextual errors, and variability of clinical language, requiring further fine-tuning.
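The candidate-ranking step of a bi-encoder like the one described above can be sketched as follows. This is schematic only: the study encodes narrative n-grams and SNOMED CT concept labels with SapBERT, whereas here the embedding vectors and concept IDs are made-up placeholders that merely illustrate the cosine-similarity ranking.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def rank_candidates(mention_vec, concept_vecs):
    """Rank candidate concepts by embedding similarity to a mention n-gram,
    as a bi-encoder would. concept_vecs maps concept ID -> embedding vector
    (placeholder vectors here; a real system would use SapBERT embeddings)."""
    scored = [(cid, cosine(mention_vec, vec)) for cid, vec in concept_vecs.items()]
    return sorted(scored, key=lambda t: t[1], reverse=True)
```

In the study's pipeline, the top-ranked concept's semantic tag is used to assess the NER phase and its concept ID to assess the MCN phase, with a syntactic re-ranker applied afterwards.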
Affiliation(s)
- Akhila Abdulnazar
- Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Austria
- CBmed GmbH – Center for Biomarker Research in Medicine, Graz, Austria
| | - Roland Roller
- German Research Center for Artificial Intelligence (DFKI), Berlin, Germany
| | - Stefan Schulz
- Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Austria
| | - Markus Kreuzthaler
- Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, Austria
| |
|
134
|
Holodinsky JK, Wrenn JO, Trivedi S, Hess E, Lang E. When you have a hammer, everything looks like a nail: but what kind of hammer is ChatGPT? CAN J EMERG MED 2024; 26:1-2. [PMID: 38194060 DOI: 10.1007/s43678-023-00631-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/10/2024]
Affiliation(s)
- Jessalyn K Holodinsky
- Department of Emergency Medicine, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada.
- Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada.
- Center for Health Informatics, Cumming School of Medicine, University of Calgary, CWPH 5E36, 3280 Hospital Drive NW, Calgary, AB, T2N 4Z6, Canada.
- Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada.
- O'Brien Institute for Public Health, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada.
- Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada.
| | - Jesse O Wrenn
- Department of Emergency Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Division of Emergency Medicine, Tennessee Valley Healthcare System VA, Nashville, TN, USA
| | - Sachin Trivedi
- Department of Emergency Medicine, University of Saskatchewan, Saskatoon, SK, Canada
| | - Erik Hess
- Department of Emergency Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Eddy Lang
- Department of Emergency Medicine, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
| |
|
135
|
Xu S, Chen S, Chen M. Prudent Promotion, Steady Development: Capability and Safety Considerations for Applying Large Language Models in Medicine. COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE 2024:110-123. [DOI: 10.1007/978-981-97-1280-9_9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/06/2025]
|
136
|
Shojaei P, Khosravi M, Jafari Y, Mahmoudi AH, Hassanipourmahani H. ChatGPT utilization within the building blocks of the healthcare services: A mixed-methods study. Digit Health 2024; 10:20552076241297059. [PMID: 39559384 PMCID: PMC11571260 DOI: 10.1177/20552076241297059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2024] [Accepted: 10/17/2024] [Indexed: 11/20/2024] Open
Abstract
Introduction ChatGPT, as an AI tool, has been introduced in healthcare for various purposes. The objective of the study was to investigate the principal benefits of ChatGPT utilization in healthcare services and to identify potential domains for its expansion within the building blocks of the healthcare industry. Methods A comprehensive three-phase study was conducted employing mixed methods. The initial phase comprised a systematic review and thematic analysis of the data. In the subsequent phases, a questionnaire, developed based on the findings from the first phase, was distributed to a sample of eight experts. The objective was to prioritize the benefits and potential expansion domains of ChatGPT in healthcare building blocks, utilizing gray SWARA (Stepwise Weight Assessment Ratio Analysis) and gray MABAC (Multi-Attributive Border Approximation Area Comparison), respectively. Results The systematic review yielded 74 studies. A thematic analysis of the data from these studies identified 11 unique themes. In the second phase, employing the gray SWARA method, clinical decision-making (weight: 0.135), medical diagnosis (weight: 0.098), medical procedures (weight: 0.070), and patient-centered care (weight: 0.053) emerged as the most significant benefits of ChatGPT in the healthcare sector. Subsequently, it was determined that ChatGPT demonstrated the highest level of usefulness in the information and infrastructure and the information and communication technology blocks. Conclusion The study concluded that, despite the significant benefits of ChatGPT in the clinical domains of healthcare, it exhibits a more pronounced potential for growth within the informational domains of the healthcare industry's building blocks, rather than within the domains of intervention and clinical services.
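The study above derives its criterion weights with gray SWARA. As a rough intuition for how such expert-elicited weights arise, here is the crisp (non-gray) SWARA recurrence — a sketch under stated assumptions, not the authors' gray-number implementation, and the example importance ratios are invented:

```python
def swara_weights(comparative_importances):
    """Crisp SWARA: criteria are pre-ranked from most to least important.
    Each s_j (>= 0) states how much less important criterion j+1 is than
    criterion j; the recurrence q_{j+1} = q_j / (s_j + 1) is then
    normalized into weights."""
    q, raw = 1.0, [1.0]
    for s in comparative_importances:   # criteria 2..n
        q /= (s + 1.0)
        raw.append(q)
    total = sum(raw)
    return [x / total for x in raw]

# Illustrative: three criteria ranked by experts, e.g.
# clinical decision-making > medical diagnosis > medical procedures.
w = swara_weights([0.35, 0.40])
print([round(x, 3) for x in w])
```

The gray variant used in the paper replaces each s_j with an interval number elicited from the expert panel, but the same descending-weight structure results.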
Affiliation(s)
- Payam Shojaei
- Department of Management, Shiraz University, Shiraz, Iran
| | - Mohsen Khosravi
- Department of Healthcare Management, School of Management and Medical Informatics, Shiraz University of Medical Sciences, Shiraz, Iran
| | - Yalda Jafari
- Department of Management, Shiraz University, Shiraz, Iran
| | - Amir Hossein Mahmoudi
- Department of Operations Management & Decision Sciences, Faculty of Management, University of Tehran, Tehran, Iran
| | - Hadis Hassanipourmahani
- Department of Information Technology Management, Faculty of Management, University of Tehran, Tehran, Iran
| |
|
137
|
Sahu PK, Benjamin LA, Singh Aswal G, Williams-Persad A. ChatGPT in research and health professions education: challenges, opportunities, and future directions. Postgrad Med J 2023; 100:50-55. [PMID: 37819738 DOI: 10.1093/postmj/qgad090] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/01/2023] [Accepted: 09/15/2023] [Indexed: 10/13/2023]
Abstract
ChatGPT was launched by OpenAI in November 2022 and within 2 months it became popular across a wide range of industrial, social, and intellectual contexts including healthcare education. This article reviews the impact of ChatGPT on research and health professions education by identifying the challenges and opportunities in these fields. Additionally, it aims to provide future directions to mitigate the challenges and maximize the benefits of this technology in health professions education. ChatGPT has the potential to revolutionize the field of research and health professions education. However, there is a need to address ethical concerns and limitations such as lack of real-time data, data inaccuracies, biases, plagiarism, and copyright infringement before its implementation. Future research can highlight the ways to mitigate these challenges; establish guidelines and policies; and explore how effectively ChatGPT and other AI tools can be used in the field of research and healthcare professions education.
Affiliation(s)
- Pradeep Kumar Sahu
- Centre For Medical Sciences Education, Faculty of Medical Sciences, The University of the West Indies, St Augustine, Trinidad and Tobago
| | - Lisa A Benjamin
- Department of Basic Veterinary Sciences, School of Veterinary Medicine, Faculty of Medical Sciences, The University of the West Indies, St Augustine, Trinidad and Tobago
| | - Gunjan Singh Aswal
- Department of Restorative Dentistry, School of Dentistry, Faculty of Medical Sciences, The University of the West Indies, St Augustine Trinidad and Tobago
| | - Arlene Williams-Persad
- Department of Paraclinical Sciences, Faculty of Medical Sciences, The University of the West Indies St. Augustine, Trinidad and Tobago West Indies
| |
|
138
|
Lu J, Gao W, Wang Z, Yang N, Pang WIP, In Lok GK, Rao W. Psychosocial interventions for suicidal and self-injurious-related behaviors among adolescents: a systematic review and meta-analysis of Chinese practices. Front Public Health 2023; 11:1281696. [PMID: 38164448 PMCID: PMC10757980 DOI: 10.3389/fpubh.2023.1281696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2023] [Accepted: 11/17/2023] [Indexed: 01/03/2024] Open
Abstract
Background Suicidal and self-injurious-related behaviors (SSIRBs) are a serious public health challenge in China. However, a comprehensive systematic review of psychosocial interventions for SSIRBs among Chinese adolescents has not been performed. To fill this gap, this systematic review and meta-analysis aimed to examine psychosocial interventions for SSIRBs among Chinese adolescents. Methods Eight international (PubMed, EMBASE, Cochrane Library, ScienceDirect, Clinical Trial, CINAHL, PsycINFO, and Web of Science) and four Chinese (Wanfang, SinoMed, CEPS, and CNKI) databases were searched from inception to 31 January 2023. Data extraction and quality assessment were independently conducted by two groups of researchers. Qualitative synthesis and meta-analysis were both used. Results The initial search yielded 16,872 titles. Of the 649 full texts reviewed, 19 intervention articles focusing on SSIRBs met the inclusion criteria. Thirteen of the 19 included studies involved cognitive-behavioral therapy (CBT). Seven non-suicidal self-injury (NSSI) studies assessing self-injurious behaviors were included (six short-term studies and three long-term studies). Compared with long-term interventions [-1.30 (95% CI: -1.84, -0.76)], short-term psychosocial interventions had a larger standardized mean difference (SMD) [-1.86 (95% CI: -2.72, -0.99)]. Meta-regression showed an inverse relationship between the treatment response and both sample size (slope = 0.068, Z = 2.914, p = 0.004) and the proportion of females (slope = 1.096, Z = 5.848, p < 0.001). Subgroup analyses showed that the pooled estimate was significantly lower in the "immediate postintervention" group [-2.800 (-4.050, -1.550), p < 0.001] than in the "less than 1 month" group [-0.494 (-0.783, -0.205)]. Conclusion Our review systematically summarized the key characteristics and effectiveness of existing psychosocial interventions for SSIRBs among Chinese adolescents. Short-term psychosocial interventions for NSSI were significantly effective in reducing self-injurious behavior scores, especially in the immediate postintervention period. More favorable treatment responses were observed in samples with more males and in smaller samples.
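The pooled effect sizes in the abstract above are standardized mean differences. A minimal sketch of the underlying statistic (Cohen's d with a pooled standard deviation) follows; the function name and the group means, SDs, and sample sizes are illustrative, not taken from the review:

```python
import math

def cohens_d(m1, s1, n1, m2, s2, n2):
    """Standardized mean difference: (mean1 - mean2) / pooled SD."""
    pooled_sd = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2)
                          / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

# Illustrative: intervention group scores lower on a self-injury scale,
# so a negative d favors the intervention.
d = cohens_d(m1=10.0, s1=2.0, n1=20, m2=12.0, s2=2.0, n2=20)
print(round(d, 2))
```

Meta-analyses such as this one compute a d-type SMD per study and then pool them (with a correction such as Hedges' g and inverse-variance weighting), which is beyond this sketch.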
Affiliation(s)
- Junjie Lu
- Department of Preventive Medicine, Shantou University Medical College, Shantou, Guangdong, China
| | - Wanting Gao
- Department of Preventive Medicine, Shantou University Medical College, Shantou, Guangdong, China
| | - Zexin Wang
- Faculty of Health Sciences and Sports, Macao Polytechnic University, Macao, Macao SAR, China
| | - Nan Yang
- Faculty of Health Sciences and Sports, Macao Polytechnic University, Macao, Macao SAR, China
| | - Weng Ian Phoenix Pang
- Faculty of Health Sciences and Sports, Macao Polytechnic University, Macao, Macao SAR, China
| | - Grace Ka In Lok
- Macao Polytechnic University, Peking University Health Science Center-Macao Polytechnic University Nursing Academy, Macao, Macao SAR, China
| | - Wenwang Rao
- Department of Preventive Medicine, Shantou University Medical College, Shantou, Guangdong, China
| |
|
139
|
Madrid-García A, Rosales-Rosado Z, Freites-Nuñez D, Pérez-Sancristóbal I, Pato-Cour E, Plasencia-Rodríguez C, Cabeza-Osorio L, Abasolo-Alcázar L, León-Mateos L, Fernández-Gutiérrez B, Rodríguez-Rodríguez L. Harnessing ChatGPT and GPT-4 for evaluating the rheumatology questions of the Spanish access exam to specialized medical training. Sci Rep 2023; 13:22129. [PMID: 38092821 PMCID: PMC10719375 DOI: 10.1038/s41598-023-49483-6] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2023] [Accepted: 12/08/2023] [Indexed: 12/17/2023] Open
Abstract
The emergence of large language models (LLM) with remarkable performance, such as ChatGPT and GPT-4, has led to an unprecedented uptake in the population. One of their most promising and studied applications concerns education, due to their ability to understand and generate human-like text, creating a multitude of opportunities for enhancing educational practices and outcomes. The objective of this study is twofold: to assess the accuracy of ChatGPT/GPT-4 in answering rheumatology questions from the access exam to specialized medical training in Spain (MIR), and to evaluate the medical reasoning followed by these LLM to answer those questions. A dataset, RheumaMIR, of 145 rheumatology-related questions, extracted from the exams held between 2010 and 2023, was created for that purpose, used as a prompt for the LLM, and was publicly distributed. Six rheumatologists with clinical and teaching experience evaluated the clinical reasoning of the chatbots using a 5-point Likert scale, and their degree of agreement was analyzed. The association between variables that could influence the models' accuracy (i.e., year of the exam question, disease addressed, type of question, and gender) was studied. ChatGPT demonstrated a high level of performance in both accuracy (66.43%) and clinical reasoning (median [Q1-Q3]: 4.5 [2.33-4.67]). However, GPT-4 showed better performance, with an accuracy of 93.71% and a median clinical reasoning value of 4.67 (4.5-4.83). These findings suggest that LLMs may serve as valuable tools in rheumatology education, aiding in exam preparation and supplementing traditional teaching methods.
Affiliation(s)
- Alfredo Madrid-García
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain.
| | - Zulema Rosales-Rosado
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
| | - Dalifer Freites-Nuñez
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
| | - Inés Pérez-Sancristóbal
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
| | - Esperanza Pato-Cour
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
| | | | - Luis Cabeza-Osorio
- Medicina Interna, Hospital Universitario del Henares, Avenida de Marie Curie, 0, 28822, Madrid, Spain
- Facultad de Medicina, Universidad Francisco de Vitoria, Carretera Pozuelo, Km 1800, 28223, Madrid, Spain
| | - Lydia Abasolo-Alcázar
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
| | - Leticia León-Mateos
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
| | - Benjamín Fernández-Gutiérrez
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Facultad de Medicina, Universidad Complutense de Madrid, Madrid, Spain
| | - Luis Rodríguez-Rodríguez
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
| |
|
140
|
Huo B, Cacciamani GE, Collins GS, McKechnie T, Lee Y, Guyatt G. Reporting standards for the use of large language model-linked chatbots for health advice. Nat Med 2023; 29:2988. [PMID: 37957381 DOI: 10.1038/s41591-023-02656-2] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2023]
Affiliation(s)
- Bright Huo
- Division of General Surgery, Department of Surgery, McMaster University, Hamilton, Ontario, Canada.
| | - Giovanni E Cacciamani
- USC Institute of Urology and Catherine and Joseph Aresty Department of Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- AI Center at USC Urology, University of Southern California, Los Angeles, CA, USA
| | - Gary S Collins
- Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology & Musculoskeletal Sciences, University of Oxford, Oxford, UK
- UK EQUATOR Centre, University of Oxford, Oxford, UK
| | - Tyler McKechnie
- Division of General Surgery, Department of Surgery, McMaster University, Hamilton, Ontario, Canada
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
| | - Yung Lee
- Division of General Surgery, Department of Surgery, McMaster University, Hamilton, Ontario, Canada
- Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
| | - Gordon Guyatt
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada
- Department of Medicine, McMaster University, Hamilton, Ontario, Canada
| |
|
141
|
Ray PP. Generative Artificial Intelligence (AI) and Medical Ethics: A Symbiotic Dance for the Future. J Oral Maxillofac Surg 2023; 81:1457-1459. [PMID: 38044013 DOI: 10.1016/j.joms.2023.09.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/06/2023] [Accepted: 09/06/2023] [Indexed: 12/05/2023]
Affiliation(s)
- Partha Pratim Ray
- Assistant Professor, Department of Computer Applications, Sikkim University, Gangtok, India.
| |
|
142
|
Shaban-Nejad A, Michalowski M, Bianco S. Creative and generative artificial intelligence for personalized medicine and healthcare: Hype, reality, or hyperreality? Exp Biol Med (Maywood) 2023; 248:2497-2499. [PMID: 38311873 PMCID: PMC10854468 DOI: 10.1177/15353702241226801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2024] Open
Affiliation(s)
- Arash Shaban-Nejad
- UTHSC-ORNL Center for Biomedical Informatics and Department of Pediatrics, College of Medicine, The University of Tennessee Health Science Center, Memphis, TN 38163, USA
| | - Martin Michalowski
- School of Nursing, University of Minnesota—Twin Cities, Minneapolis, MN 55455, USA
| | - Simone Bianco
- Altos Labs Bay Area Institute of Science, Redwood City, CA 94065, USA
| |
|
143
|
Chlorogiannis DD, Apostolos A, Chlorogiannis A, Palaiodimos L, Giannakoulas G, Pargaonkar S, Xesfingi S, Kokkinidis DG. The Role of ChatGPT in the Advancement of Diagnosis, Management, and Prognosis of Cardiovascular and Cerebrovascular Disease. Healthcare (Basel) 2023; 11:2906. [PMID: 37958050 PMCID: PMC10648908 DOI: 10.3390/healthcare11212906] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2023] [Revised: 10/24/2023] [Accepted: 11/04/2023] [Indexed: 11/15/2023] Open
Abstract
Cardiovascular and cerebrovascular disease incidence has risen, mainly due to poor control of preventable risk factors, and still constitutes a significant financial and health burden worldwide. ChatGPT is an artificial intelligence language-based model developed by OpenAI. Due to the model's unique cognitive capabilities beyond data processing and the production of high-quality text, there has been a surge of research interest concerning its role in the scientific community and contemporary clinical practice. To fully exploit ChatGPT's potential benefits and reduce its possible misuse, extreme caution must be taken to ensure that it is implemented ethically and equitably. In this narrative review, we explore the language model's possible applications and limitations while emphasizing its potential value for the diagnosis, management, and prognosis of cardiovascular and cerebrovascular disease.
Affiliation(s)
| | - Anastasios Apostolos
- First Department of Cardiology, School of Medicine, National Kapodistrian University of Athens, Hippokrateion General Hospital of Athens, 115 27 Athens, Greece;
| | - Anargyros Chlorogiannis
- Department of Health Economics, Policy and Management, Karolinska Institutet, 171 77 Stockholm, Sweden
| | - Leonidas Palaiodimos
- Division of Hospital Medicine, Jacobi Medical Center, NYC H+H, Albert Einstein College of Medicine, New York, NY 10461, USA; (L.P.); (S.P.)
| | - George Giannakoulas
- Department of Cardiology, AHEPA University Hospital, Aristotle University of Thessaloniki, 541 24 Thessaloniki, Greece;
| | - Sumant Pargaonkar
- Division of Hospital Medicine, Jacobi Medical Center, NYC H+H, Albert Einstein College of Medicine, New York, NY 10461, USA; (L.P.); (S.P.)
| | - Sofia Xesfingi
- Department of Economics, University of Piraeus, 185 34 Piraeus, Greece
| | - Damianos G. Kokkinidis
- Section of Cardiovascular Medicine, Yale University School of Medicine, New Haven, CT 06510, USA
| |
|
144
|
Liu J, Liu F, Fang J, Liu S. The application of Chat Generative Pre-trained Transformer in nursing education. Nurs Outlook 2023; 71:102064. [PMID: 37879261 DOI: 10.1016/j.outlook.2023.102064] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2023] [Revised: 09/12/2023] [Accepted: 09/29/2023] [Indexed: 10/27/2023]
Abstract
BACKGROUND Nursing education is critical for nurses to deliver quality health care. Incorporating AI into education can enhance the learning process and better equip nurses for their health care roles. PURPOSE This article explores the potential applications and challenges of ChatGPT in nursing education. METHODS A comprehensive literature review was conducted to explore the potential benefits and challenges of using ChatGPT in nursing education. DISCUSSION ChatGPT, an advanced large language model, has the potential to make valuable contributions to nursing education in various ways, including personalized learning, simulation scenarios, immediate feedback, and reducing educator workload. However, it is important to address the various challenges and limitations in order to realize its full potential. CONCLUSION Nursing educators must carefully consider the potential uses, benefits, challenges, drawbacks, and limitations of ChatGPT to make informed decisions about its integration into nursing education.
Affiliation(s)
- Jialin Liu
- Department of Medical Informatics, West China Medical School, Chengdu, Sichuan, China; Department of Otolaryngology-Head and Neck Surgery, West China Hospital, Sichuan University, Chengdu, Sichuan, China.
| | - Fan Liu
- Department of Nursing, West China Hospital of Stomatology, Sichuan University, Chengdu, Sichuan, China
| | - Jinbo Fang
- Department of Nursing, West China Hospital, West China School of Nursing, Sichuan University, Chengdu, Sichuan, China
| | - Siru Liu
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN
| |
|
145
|
van Manen M. What Does ChatGPT Mean for Qualitative Health Research? QUALITATIVE HEALTH RESEARCH 2023; 33:1135-1139. [PMID: 37897694 DOI: 10.1177/10497323231210816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/30/2023]
Affiliation(s)
- Michael van Manen
- Department of Pediatrics, University of Alberta, Edmonton, AB, Canada
- John Dossetor Health Ethics Centre, University of Alberta, Edmonton, AB, Canada
| |
|
146
|
Dhane AS. Comment on "Role of AI-based ChatGPT in oral and maxillofacial surgery: A friend or foe?". Oral Oncol 2023; 146:106561. [PMID: 37619522 DOI: 10.1016/j.oraloncology.2023.106561] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2023] [Accepted: 08/17/2023] [Indexed: 08/26/2023]
Affiliation(s)
- Amol S Dhane
- Research & Development Cell, Dr. D.Y. Patil Vidyapeeth, Sant-Tukaram Nagar, Pimpri, Pune 411018, MH, India.
| |
|
147
|
Tong W, Guan Y, Chen J, Huang X, Zhong Y, Zhang C, Zhang H. Artificial intelligence in global health equity: an evaluation and discussion on the application of ChatGPT, in the Chinese National Medical Licensing Examination. Front Med (Lausanne) 2023; 10:1237432. [PMID: 38020160 PMCID: PMC10656681 DOI: 10.3389/fmed.2023.1237432] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/09/2023] [Accepted: 10/09/2023] [Indexed: 12/01/2023] Open
Abstract
Background The demand for healthcare is increasing globally, with notable disparities in access to resources, especially in Asia, Africa, and Latin America. The rapid development of Artificial Intelligence (AI) technologies, such as OpenAI's ChatGPT, has shown promise in revolutionizing healthcare. However, potential challenges, including the need for specialized medical training, privacy concerns, and language bias, require attention. Methods To assess the applicability and limitations of ChatGPT in Chinese and English settings, we designed an experiment evaluating its performance in the 2022 National Medical Licensing Examination (NMLE) in China. For a standardized evaluation, we used the comprehensive written part of the NMLE, translated into English by a bilingual expert. All questions were input into ChatGPT, which provided answers and reasons for choosing them. Responses were evaluated for "information quality" using a Likert scale. Results ChatGPT demonstrated a correct response rate of 81.25% for Chinese and 86.25% for English questions. Logistic regression analysis showed that neither the difficulty nor the subject matter of the questions was a significant factor in AI errors. The Brier scores, indicating predictive accuracy, were 0.19 for Chinese and 0.14 for English, indicating good predictive performance. The average quality score for English responses was excellent (4.43 points), slightly higher than for Chinese (4.34 points). Conclusion While AI language models like ChatGPT show promise for global healthcare, language bias is a key challenge. Ensuring that such technologies are robustly trained and sensitive to multiple languages and cultures is vital. Further research into AI's role in healthcare, particularly in areas with limited resources, is warranted.
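The Brier scores reported in the abstract above are the mean squared gap between a predicted probability and the observed 0/1 outcome (lower is better). A minimal sketch of the standard definition, with made-up confidences rather than the study's data:

```python
def brier_score(probs, outcomes):
    """Mean of (predicted probability - binary outcome)^2 over all items."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# Illustrative: model confidence on three exam answers,
# where outcome 1 means the answer was correct.
print(round(brier_score([0.9, 0.6, 0.8], [1, 0, 1]), 3))
```

A perfectly calibrated, always-correct predictor would score 0.0; values near the study's 0.14-0.19 indicate reasonably well-calibrated predictions.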
Affiliation(s)
- Wenting Tong
- Department of Pharmacy, Gannan Healthcare Vocational College, Ganzhou, Jiangxi, China
| | - Yongfu Guan
- Department of Rehabilitation and Elderly Care, Gannan Healthcare Vocational College, Ganzhou, Jiangxi, China
| | - Jinping Chen
- Department of Rehabilitation and Elderly Care, Gannan Healthcare Vocational College, Ganzhou, Jiangxi, China
| | - Xixuan Huang
- Department of Mathematics, Xiamen University, Xiamen, Fujian, China
| | - Yuting Zhong
- Department of Anesthesiology, Gannan Medical University, Jiangxi, China
| | - Changrong Zhang
- Department of Chinese Medicine, Affiliated Hospital of Qinghai University, Xining, Qinghai, China
| | - Hui Zhang
- Department of Rehabilitation and Elderly Care, Gannan Healthcare Vocational College, Ganzhou, Jiangxi, China
- Chair of Endocrinology and Medical Sexology (ENDOSEX), Department of Experimental Medicine, University of Rome Tor Vergata, Rome, Italy
| |
|