1
Garcia Garcia L, Emile SH, Linkeshwaran L, Wignakumar A, Wexner SD. A literature review on the role of artificial intelligence-based chatbots in patient education in colorectal surgery. Surgery 2025; 183:109393. [PMID: 40347684 DOI: 10.1016/j.surg.2025.109393]
Abstract
INTRODUCTION Artificial intelligence-based chatbots are increasingly being used in patient education, including in the realm of colorectal diseases. Perhaps not surprisingly, healthcare professionals have raised concerns about the appropriateness of chatbot answers. Numerous studies have explored the utility and accuracy of chatbots in providing information across several clinical disciplines. This review aimed to summarize the findings of published studies, highlighting the strengths and limitations of chatbots used in patient education for colorectal surgery. METHODS We searched MEDLINE via PubMed and Scopus in February 2025 for original articles evaluating artificial intelligence-based chatbots in patient education related to colorectal surgery, categorizing them into 3 groups: colorectal cancer, inflammatory bowel diseases, and other colorectal conditions. RESULTS We identified 15 studies: 9 assessed chatbot utility in patient education in colorectal cancer, 4 assessed their utility in inflammatory bowel diseases, 1 involved benign anal conditions, and another involved intestinal stomas. Our findings indicated that chatbots, particularly ChatGPT, can improve patient education by providing accessible information on common questions. However, we also identified several limitations in the ability of chatbots to address complex medical issues, underscoring that these tools may complement rather than replace professional medical guidance. CONCLUSION Chatbots may be useful for patient education involving simple and basic information, but not in complex and patient-specific settings. Future research should focus on refining chatbot algorithms to enhance the accuracy and depth of their responses, ensuring they effectively support patient education while maintaining the crucial role of healthcare providers.
Affiliation(s)
- Laura Garcia Garcia
- Servicio de Cirugía General y Digestiva, Complejo Hospitalario Universitario Materno Infantil de Gran Canaria, Las Palmas de Gran Canaria, Spain
- Sameh Hany Emile
- Ellen Leifer Shulman and Steven Shulman Digestive Disease Center, Cleveland Clinic Florida, Weston, FL; Colorectal Surgery Unit, General Surgery Department, Mansoura University Hospitals, Mansoura, Egypt. https://twitter.com/dr_samehhany81
- Anjelli Wignakumar
- Ellen Leifer Shulman and Steven Shulman Digestive Disease Center, Cleveland Clinic Florida, Weston, FL. https://twitter.com/AWignakumar
- Steven D Wexner
- Ellen Leifer Shulman and Steven Shulman Digestive Disease Center, Cleveland Clinic Florida, Weston, FL
2
Godwin RC, Tung A, Berkowitz DE, Melvin RL. Transforming Physiology and Healthcare through Foundation Models. Physiology (Bethesda) 2025; 40:0. [PMID: 39832521 DOI: 10.1152/physiol.00048.2024]
Abstract
Recent developments in artificial intelligence (AI) may significantly alter physiological research and healthcare delivery. Whereas AI applications in medicine have historically been trained for specific tasks, recent technological advances have produced models trained on more diverse datasets with much higher parameter counts. These new "foundation" models raise the possibility that more flexible AI tools can be applied to a wider set of healthcare tasks than in the past. This review describes how these newer models differ from conventional task-specific AI, which relies heavily on focused datasets and narrow, specific applications. By examining the integration of AI into diagnostic tools, personalized treatment strategies, biomedical research, and healthcare administration, we highlight how these newer models are revolutionizing predictive healthcare analytics and operational workflows. In addition, we address ethical and practical considerations associated with the use of foundation models by highlighting emerging trends, calling for changes to existing guidelines, and emphasizing the importance of aligning AI with clinical goals to ensure its responsible and effective use.
Affiliation(s)
- Ryan C Godwin
- Department of Anesthesiology and Perioperative Medicine, University of Alabama at Birmingham, Birmingham, Alabama, United States
- Avery Tung
- Department of Anesthesia and Critical Care, University of Chicago, Chicago, Illinois, United States
- Dan E Berkowitz
- Department of Anesthesiology and Perioperative Medicine, University of Alabama at Birmingham, Birmingham, Alabama, United States
- Ryan L Melvin
- Department of Anesthesiology and Perioperative Medicine, University of Alabama at Birmingham, Birmingham, Alabama, United States
3
Raman R. Transparency in research: An analysis of ChatGPT usage acknowledgment by authors across disciplines and geographies. Account Res 2025; 32:277-298. [PMID: 37877216 DOI: 10.1080/08989621.2023.2273377]
Abstract
This investigation systematically reviews the recognition of generative AI tools, particularly ChatGPT, in scholarly literature. Utilizing 1,226 publications from the Dimensions database, ranging from November 2022 to July 2023, the research scrutinizes temporal trends and distribution across disciplines and regions. U.S.-based authors lead in acknowledgments, with notable contributions from China and India. Predominantly, Biomedical and Clinical Sciences, as well as Information and Computing Sciences, are engaging with these AI tools. Publications like "The Lancet Digital Health" and platforms such as "bioRxiv" are recurrent venues for such acknowledgments, highlighting AI's growing impact on research dissemination. The analysis is confined to the Dimensions database, thus potentially overlooking other sources and grey literature. Additionally, the study abstains from examining the acknowledgments' quality or ethical considerations. Findings are beneficial for stakeholders, providing a basis for policy and scholarly discourse on ethical AI use in academia. This study represents the inaugural comprehensive empirical assessment of AI acknowledgment patterns in academic contexts, addressing a previously unexplored aspect of scholarly communication.
Affiliation(s)
- Raghu Raman
- Amrita School of Business, Amrita Vishwa Vidyapeetham, Amritapuri, Kerala, India
4
Nasef H, Patel H, Amin Q, Baum S, Ratnasekera A, Ang D, Havron WS, Nakayama D, Elkbuli A. Evaluating the Accuracy, Comprehensiveness, and Validity of ChatGPT Compared to Evidence-Based Sources Regarding Common Surgical Conditions: Surgeons' Perspectives. Am Surg 2025; 91:325-335. [PMID: 38794965 DOI: 10.1177/00031348241256075]
Abstract
Background: This study aims to assess the accuracy, comprehensiveness, and validity of ChatGPT compared to evidence-based sources regarding the diagnosis and management of common surgical conditions by surveying the perceptions of U.S. board-certified practicing surgeons. Methods: An anonymous cross-sectional survey was distributed to U.S. practicing surgeons from June 2023 to March 2024. The survey comprised 94 multiple-choice questions evaluating diagnostic and management information for five common surgical conditions from evidence-based sources or generated by ChatGPT. Statistical analysis included descriptive statistics and paired-sample t-tests. Results: Participating surgeons were primarily aged 40-50 years (43%), male (86%), White (57%), and had 5-10 years or >15 years of experience (86%). The majority of surgeons had no prior experience with ChatGPT in surgical practice (86%). For material discussing both acute cholecystitis and upper gastrointestinal hemorrhage, evidence-based sources were rated as significantly more comprehensive (3.57 (±.535) vs 2.00 (±1.16), P = .025) (4.14 (±.69) vs 2.43 (±.98), P < .001) and valid (3.71 (±.488) vs 2.86 (±1.07), P = .045) (3.71 (±.76) vs 2.71 (±.95), P = .038) than ChatGPT. However, there was no significant difference in accuracy between the two sources (3.71 vs 3.29, P = .289) (3.57 vs 2.71, P = .111). Conclusion: Surveyed U.S. board-certified practicing surgeons rated evidence-based sources as significantly more comprehensive and valid compared to ChatGPT across the majority of surveyed surgical conditions. However, there was no significant difference in accuracy between the sources across the majority of surveyed conditions. While ChatGPT may offer potential benefits in surgical practice, further refinement and validation are necessary to enhance its utility and acceptance among surgeons.
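For readers who want to see the shape of the paired-sample analysis this abstract describes, a minimal sketch is below; it assumes SciPy, and the 1-5 rating vectors are invented placeholders, not the study's data.

```python
# Hypothetical sketch of a paired-sample t-test on surgeons' ratings of the
# same condition's material from two sources; ratings are invented examples.
from scipy.stats import ttest_rel

evidence_based_validity = [4, 4, 3, 4, 3, 4, 4]
chatgpt_validity        = [3, 2, 3, 3, 2, 3, 4]

t_stat, p_value = ttest_rel(evidence_based_validity, chatgpt_validity)
print(f"paired t = {t_stat:.2f}, p = {p_value:.3f}")
```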
Affiliation(s)
- Hazem Nasef
- NOVA Southeastern University, Kiran Patel College of Allopathic Medicine, Fort Lauderdale, FL, USA
- Heli Patel
- NOVA Southeastern University, Kiran Patel College of Allopathic Medicine, Fort Lauderdale, FL, USA
- Quratulain Amin
- NOVA Southeastern University, Kiran Patel College of Allopathic Medicine, Fort Lauderdale, FL, USA
- Samuel Baum
- Louisiana State University Health Science Center, College of Medicine, New Orleans, LA, USA
- Darwin Ang
- Department of Surgery, Ocala Regional Medical Center, Ocala, FL, USA
- William S Havron
- Department of Surgical Education, Orlando Regional Medical Center, Orlando, FL, USA
- Department of Surgery, Division of Trauma and Surgical Critical Care, Orlando Regional Medical Center, Orlando, FL, USA
- Don Nakayama
- Mercer University School of Medicine, Columbus, GA, USA
- Adel Elkbuli
- Department of Surgical Education, Orlando Regional Medical Center, Orlando, FL, USA
- Department of Surgery, Division of Trauma and Surgical Critical Care, Orlando Regional Medical Center, Orlando, FL, USA
5
Desai P, Wang H, Davis L, Ullmann TM, DiBrito SR. Bias Perpetuates Bias: ChatGPT Learns Gender Inequities in Academic Surgery Promotions. J Surg Educ 2024; 81:1553-1557. [PMID: 39232303 DOI: 10.1016/j.jsurg.2024.07.023]
Abstract
OBJECTIVE Gender inequities persist in academic surgery, with implicit bias impacting hiring and promotion at all levels. We hypothesized that creating letters of recommendation for female and male candidates for academic promotion in surgery using an AI platform, ChatGPT, would elucidate the entrained gender biases already present in the promotion process. DESIGN Using ChatGPT, we generated 6 letters of recommendation for "a phenomenal surgeon applying for job promotion to associate professor position", specifying "female" or "male" before "surgeon" in the prompt. We compared the 3 "female" letters to the 3 "male" letters for differences in length, language, and tone. RESULTS The letters written for females averaged 298 words, compared with 314 for males. Female letters more frequently referred to "compassion", "empathy", and "inclusivity", whereas male letters referred to "respect", "reputation", and "skill". CONCLUSIONS These findings highlight the gender bias present in promotion letters generated by ChatGPT, echoing existing literature on real letters of recommendation in academic surgery. Our study suggests that surgeons should use AI tools such as ChatGPT with caution when writing letters of recommendation for academic surgery faculty promotion.
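An illustrative sketch of the kind of length-and-language comparison the study describes follows; the letter text and the communal/agentic term sets are placeholders, not the study's materials.

```python
# Count total words and occurrences of communal vs. agentic terms in a letter.
import re
from collections import Counter

COMMUNAL = {"compassion", "empathy", "inclusivity"}  # noted in "female" letters
AGENTIC = {"respect", "reputation", "skill"}         # noted in "male" letters

def profile(letter: str) -> dict:
    words = re.findall(r"[a-z']+", letter.lower())
    counts = Counter(words)
    return {
        "word_count": len(words),
        "communal_terms": sum(counts[t] for t in COMMUNAL),
        "agentic_terms": sum(counts[t] for t in AGENTIC),
    }

sample_letter = (
    "Her empathy and compassion are matched by her technical skill, "
    "and she fosters inclusivity across the department."
)
print(profile(sample_letter))
```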
Affiliation(s)
- Pooja Desai
- Department of Surgery, Albany Medical College, Albany, NY
- Hao Wang
- Department of Surgery, Albany Medical College, Albany, NY
- Lindy Davis
- Department of Surgery, Albany Medical College, Albany, NY
6
Kral J, Hradis M, Buzga M, Kunovsky L. Exploring the benefits and challenges of AI-driven large language models in gastroenterology: Think out of the box. Biomed Pap Med Fac Univ Palacky Olomouc Czech Repub 2024; 168:277-283. [PMID: 39234774 DOI: 10.5507/bp.2024.027]
Abstract
Artificial Intelligence (AI) has evolved significantly over the past decades, from its early concepts in the 1950s to the present era of deep learning and natural language processing. Advanced large language models (LLMs), such as the Chatbot Generative Pre-Trained Transformer (ChatGPT), are trained to generate human-like text responses. This technology has the potential to revolutionize various aspects of gastroenterology, including diagnosis, treatment, education, and decision-making support. The benefits of using LLMs in gastroenterology could include accelerating diagnosis and treatment, providing personalized care, enhancing education and training, assisting in decision-making, and improving communication with patients. However, drawbacks and challenges such as limited AI capability, training on possibly biased data, data errors, security and privacy concerns, and implementation costs must be addressed to ensure the responsible and effective use of this technology. The future of LLMs in gastroenterology relies on the ability to process and analyse large amounts of data, identify patterns, and summarize information, thereby assisting physicians in creating personalized treatment plans. As AI advances, LLMs will become more accurate and efficient, allowing for faster diagnosis and treatment of gastroenterological conditions. Ensuring effective collaboration between AI developers, healthcare professionals, and regulatory bodies is essential for the responsible and effective use of this technology. By finding the right balance between AI and human expertise and addressing the limitations and risks associated with its use, LLMs can play an increasingly significant role in gastroenterology, contributing to better patient care and supporting doctors in their work.
Affiliation(s)
- Jan Kral
- Department of Internal Medicine, University Hospital Motol and Second Faculty of Medicine, Charles University, Prague, Czech Republic
- Department of Hepatogastroenterology, Institute for Clinical and Experimental Medicine, Prague, Czech Republic
- Michal Hradis
- MAIA LABS s.r.o., Brno, Czech Republic
- Faculty of Information Technology, University of Technology, Brno, Czech Republic
- Marek Buzga
- Department of Physiology and Pathophysiology, Faculty of Medicine, University of Ostrava, Ostrava, Czech Republic
- Institute of Laboratory Medicine, University Hospital Ostrava, Ostrava, Czech Republic
- Lumir Kunovsky
- 2nd Department of Internal Medicine - Gastroenterology and Geriatrics, University Hospital Olomouc and Faculty of Medicine and Dentistry, Palacky University Olomouc, Olomouc, Czech Republic
- Department of Surgery, University Hospital Brno and Faculty of Medicine, Masaryk University, Brno, Czech Republic
- Department of Gastroenterology and Digestive Endoscopy, Masaryk Memorial Cancer Institute, Brno, Czech Republic
7
Gao Z, Ge J, Xu R, Chen X, Cai Z. Potential application of ChatGPT in Helicobacter pylori disease relevant queries. Front Med (Lausanne) 2024; 11:1489117. [PMID: 39464271 PMCID: PMC11503620 DOI: 10.3389/fmed.2024.1489117]
Abstract
Background Advances in artificial intelligence are gradually transforming various fields, but their applicability for ordinary people is unknown. This study aims to explore the ability of a large language model to address Helicobacter pylori-related questions. Methods We created several prompts on the basis of guidelines and the clinical concerns of patients. The capacity of ChatGPT to answer Helicobacter pylori queries was evaluated by experts, and ordinary people assessed the applicability of its answers. Results ChatGPT-4's responses to each prompt were good in terms of response length and repeatability. There was good agreement in each dimension (Fleiss' kappa ranged from 0.302 to 0.690, p < 0.05). The experts' accuracy, completeness, usefulness, comprehension, and satisfaction scores were generally high. Usefulness and comprehension ratings among ordinary people were significantly lower than those of the experts, while medical students gave a relatively positive evaluation. Conclusion ChatGPT-4 performs well in resolving Helicobacter pylori-related questions. Large language models may become an excellent tool for medical students in the future, but they still require further research and validation.
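A minimal sketch of the Fleiss' kappa agreement analysis reported here is shown below, assuming the statsmodels package; the ratings matrix is an invented example, not the study's data.

```python
# Compute Fleiss' kappa for multiple raters assigning categories to answers.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Rows = chatbot answers, columns = expert raters,
# values = category assigned (0 = poor, 1 = acceptable, 2 = good)
ratings = np.array([
    [2, 2, 2, 1],
    [1, 1, 2, 1],
    [0, 1, 0, 0],
    [2, 2, 2, 2],
    [1, 2, 1, 1],
])

table, _ = aggregate_raters(ratings)  # subjects x categories count table
print(f"Fleiss' kappa = {fleiss_kappa(table):.3f}")
```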
Affiliation(s)
- Xiaoyan Chen
- Department of Gastroenterology, Second Affiliated Hospital and Yuying Children’s Hospital of Wenzhou Medical University, Wenzhou, China
- Zhenzhai Cai
- Department of Gastroenterology, Second Affiliated Hospital and Yuying Children’s Hospital of Wenzhou Medical University, Wenzhou, China
8
Bektaş M, Pereira JK, Daams F, van der Peet DL. ChatGPT in surgery: a revolutionary innovation? Surg Today 2024; 54:964-971. [PMID: 38421439 PMCID: PMC11266448 DOI: 10.1007/s00595-024-02800-6]
Abstract
ChatGPT has ushered in a new era of digital health, as the model has become prominent and has been developing rapidly since its release. ChatGPT may be able to facilitate improvements in surgery as well; however, its influence on surgery is largely unknown at present. Therefore, the present study reports on the current applications of ChatGPT in the field of surgery, evaluating its workflow, practical implementations, limitations, and future perspectives. A literature search was performed using the PubMed and Embase databases, covering the period from inception until July 2023. This study revealed that ChatGPT has promising capabilities in the areas of surgical research, education, training, and practice. In daily practice, surgeons and surgical residents can be aided in performing logistics and administrative tasks, and patients can be more efficiently informed about the details of their condition. However, priority should be given to establishing proper policies and protocols to ensure the safe and reliable use of this model.
Affiliation(s)
- Mustafa Bektaş
- Amsterdam UMC Location Vrije Universiteit Amsterdam, Surgery, De Boelelaan 1117, Amsterdam, The Netherlands
- Jaime Ken Pereira
- Department of Computer Science, Vrije Universiteit Amsterdam, De Boelelaan 1105, Amsterdam, The Netherlands
- Freek Daams
- Amsterdam UMC Location Vrije Universiteit Amsterdam, Surgery, De Boelelaan 1117, Amsterdam, The Netherlands
- Donald L van der Peet
- Amsterdam UMC Location Vrije Universiteit Amsterdam, Surgery, De Boelelaan 1117, Amsterdam, The Netherlands
9
Haltaufderheide J, Ranisch R. The ethics of ChatGPT in medicine and healthcare: a systematic review on Large Language Models (LLMs). NPJ Digit Med 2024; 7:183. [PMID: 38977771 PMCID: PMC11231310 DOI: 10.1038/s41746-024-01157-x]
Abstract
With the introduction of ChatGPT, Large Language Models (LLMs) have received enormous attention in healthcare. Despite potential benefits, researchers have underscored various ethical implications. While individual instances have garnered attention, a systematic and comprehensive overview of practical applications currently researched and ethical issues connected to them is lacking. Against this background, this work maps the ethical landscape surrounding the current deployment of LLMs in medicine and healthcare through a systematic review. Electronic databases and preprint servers were queried using a comprehensive search strategy which generated 796 records. Studies were screened and extracted following a modified rapid review approach. Methodological quality was assessed using a hybrid approach. For 53 records, a meta-aggregative synthesis was performed. Four general fields of applications emerged showcasing a dynamic exploration phase. Advantages of using LLMs are attributed to their capacity in data analysis, information provisioning, support in decision-making or mitigating information loss and enhancing information accessibility. However, our study also identifies recurrent ethical concerns connected to fairness, bias, non-maleficence, transparency, and privacy. A distinctive concern is the tendency to produce harmful or convincing but inaccurate content. Calls for ethical guidance and human oversight are recurrent. We suggest that the ethical guidance debate should be reframed to focus on defining what constitutes acceptable human oversight across the spectrum of applications. This involves considering the diversity of settings, varying potentials for harm, and different acceptable thresholds for performance and certainty in healthcare. Additionally, critical inquiry is needed to evaluate the necessity and justification of LLMs' current experimental use.
Affiliation(s)
- Joschka Haltaufderheide
- Faculty of Health Sciences Brandenburg, University of Potsdam, Am Mühlenberg 9, Potsdam, 14476, Germany
- Robert Ranisch
- Faculty of Health Sciences Brandenburg, University of Potsdam, Am Mühlenberg 9, Potsdam, 14476, Germany
10
Gomez-Cabello CA, Borna S, Pressman SM, Haider SA, Forte AJ. Large Language Models for Intraoperative Decision Support in Plastic Surgery: A Comparison between ChatGPT-4 and Gemini. Medicina (Kaunas) 2024; 60:957. [PMID: 38929573 PMCID: PMC11205293 DOI: 10.3390/medicina60060957]
Abstract
Background and Objectives: Large language models (LLMs) are emerging as valuable tools in plastic surgery, potentially reducing surgeons' cognitive loads and improving patients' outcomes. This study aimed to assess and compare the current state of the two most common and readily available LLMs, Open AI's ChatGPT-4 and Google's Gemini Pro (1.0 Pro), in providing intraoperative decision support in plastic and reconstructive surgery procedures. Materials and Methods: We presented each LLM with 32 independent intraoperative scenarios spanning 5 procedures. We utilized a 5-point and a 3-point Likert scale for medical accuracy and relevance, respectively. We determined the readability of the responses using the Flesch-Kincaid Grade Level (FKGL) and Flesch Reading Ease (FRE) score. Additionally, we measured the models' response time. We compared the performance using the Mann-Whitney U test and Student's t-test. Results: ChatGPT-4 significantly outperformed Gemini in providing accurate (3.59 ± 0.84 vs. 3.13 ± 0.83, p-value = 0.022) and relevant (2.28 ± 0.77 vs. 1.88 ± 0.83, p-value = 0.032) responses. Conversely, Gemini provided more concise and readable responses, with an average FKGL (12.80 ± 1.56) significantly lower than ChatGPT-4's (15.00 ± 1.89) (p < 0.0001). However, there was no difference in the FRE scores (p = 0.174). Moreover, Gemini's average response time was significantly faster (8.15 ± 1.42 s) than ChatGPT-4's (13.70 ± 2.87 s) (p < 0.0001). Conclusions: Although ChatGPT-4 provided more accurate and relevant responses, both models demonstrated potential as intraoperative tools. Nevertheless, their performance inconsistency across the different procedures underscores the need for further training and optimization to ensure their reliability as intraoperative decision-support tools.
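A sketch of the readability-and-significance pipeline this abstract describes follows, assuming the third-party textstat package alongside SciPy; the response texts are invented placeholders, not the study's materials.

```python
# Compute Flesch-Kincaid Grade Level per response, then compare the two
# models' FKGL distributions with a Mann-Whitney U test.
import textstat
from scipy.stats import mannwhitneyu

gpt4_responses = [
    "Convert to an open approach if visualization remains inadequate.",
    "Assess flap perfusion clinically before proceeding with the inset.",
    "Irrigate, obtain cultures, and administer antibiotics per protocol.",
]
gemini_responses = [
    "Check the tourniquet time and release it before proceeding.",
    "Use a nerve stimulator to confirm the facial nerve branches.",
    "Close the wound in layers over a suction drain.",
]

fkgl_gpt4 = [textstat.flesch_kincaid_grade(r) for r in gpt4_responses]
fkgl_gemini = [textstat.flesch_kincaid_grade(r) for r in gemini_responses]

u_stat, p_value = mannwhitneyu(fkgl_gpt4, fkgl_gemini)
print(f"FKGL GPT-4 {fkgl_gpt4} vs Gemini {fkgl_gemini}; U = {u_stat}, p = {p_value:.3f}")
```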
Affiliation(s)
- Cesar A. Gomez-Cabello
- Division of Plastic Surgery, Mayo Clinic, 4500 San Pablo Rd S, Jacksonville, FL 32224, USA
- Sahar Borna
- Division of Plastic Surgery, Mayo Clinic, 4500 San Pablo Rd S, Jacksonville, FL 32224, USA
- Sophia M. Pressman
- Division of Plastic Surgery, Mayo Clinic, 4500 San Pablo Rd S, Jacksonville, FL 32224, USA
- Syed Ali Haider
- Division of Plastic Surgery, Mayo Clinic, 4500 San Pablo Rd S, Jacksonville, FL 32224, USA
- Antonio J. Forte
- Division of Plastic Surgery, Mayo Clinic, 4500 San Pablo Rd S, Jacksonville, FL 32224, USA
- Center for Digital Health, Mayo Clinic, 200 First St. SW, Rochester, MN 55905, USA
11
Pressman SM, Borna S, Gomez-Cabello CA, Haider SA, Forte AJ. AI in Hand Surgery: Assessing Large Language Models in the Classification and Management of Hand Injuries. J Clin Med 2024; 13:2832. [PMID: 38792374 PMCID: PMC11122623 DOI: 10.3390/jcm13102832]
Abstract
Background: OpenAI's ChatGPT (San Francisco, CA, USA) and Google's Gemini (Mountain View, CA, USA) are two large language models that show promise in improving and expediting medical decision making in hand surgery. Evaluating the applications of these models within the field of hand surgery is warranted. This study aims to evaluate ChatGPT-4 and Gemini in classifying hand injuries and recommending treatment. Methods: Gemini and ChatGPT were each given 68 fictionalized clinical vignettes of hand injuries twice. The models were asked to use a specific classification system and recommend surgical or nonsurgical treatment. Classifications were scored based on correctness. Results were analyzed using descriptive statistics, a paired two-tailed t-test, and sensitivity testing. Results: Gemini, correctly classifying 70.6% of hand injuries, demonstrated superior classification ability over ChatGPT (mean score 1.46 vs. 0.87, p-value < 0.001). For management, ChatGPT demonstrated higher sensitivity in recommending surgical intervention compared to Gemini (98.0% vs. 88.8%), but lower specificity (68.4% vs. 94.7%). When compared to ChatGPT, Gemini demonstrated greater response replicability. Conclusions: Large language models like ChatGPT and Gemini show promise in assisting medical decision making, particularly in hand surgery, with Gemini generally outperforming ChatGPT. These findings emphasize the importance of considering the strengths and limitations of different models when integrating them into clinical practice.
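For illustration, the sensitivity and specificity figures quoted above come from a standard confusion-matrix computation, sketched below with invented stand-in labels.

```python
# Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
def sensitivity_specificity(truth, predicted):
    tp = sum(t and p for t, p in zip(truth, predicted))
    tn = sum((not t) and (not p) for t, p in zip(truth, predicted))
    fn = sum(t and (not p) for t, p in zip(truth, predicted))
    fp = sum((not t) and p for t, p in zip(truth, predicted))
    return tp / (tp + fn), tn / (tn + fp)

# True = surgery indicated (ground truth) / surgery recommended (model)
truth     = [True, True, True, True, False, False, False, True]
predicted = [True, True, True, False, False, True, False, True]

sens, spec = sensitivity_specificity(truth, predicted)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```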
Affiliation(s)
- Sahar Borna
- Division of Plastic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA
- Syed Ali Haider
- Division of Plastic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA
- Antonio Jorge Forte
- Division of Plastic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA
- Center for Digital Health, Mayo Clinic, Rochester, MN 55905, USA
12
Pressman SM, Borna S, Gomez-Cabello CA, Haider SA, Haider C, Forte AJ. AI and Ethics: A Systematic Review of the Ethical Considerations of Large Language Model Use in Surgery Research. Healthcare (Basel) 2024; 12:825. [PMID: 38667587 PMCID: PMC11050155 DOI: 10.3390/healthcare12080825]
Abstract
INTRODUCTION As large language models receive greater attention in medical research, investigation of the associated ethical considerations is warranted. This review aims to explore the surgery literature to identify ethical concerns surrounding these artificial intelligence models and to evaluate how autonomy, beneficence, nonmaleficence, and justice are represented within these ethical discussions, providing insights to guide further research and practice. METHODS A systematic review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. Five electronic databases were searched in October 2023. Eligible studies included surgery-related articles that focused on large language models and contained adequate ethical discussion. Study details, including specialty and ethical concerns, were collected. RESULTS The literature search yielded 1179 articles, with 53 meeting the inclusion criteria. Plastic surgery, orthopedic surgery, and neurosurgery were the most represented surgical specialties. Autonomy was the most explicitly cited ethical principle. The most frequently discussed ethical concern was accuracy (n = 45, 84.9%), followed by bias, patient confidentiality, and responsibility. CONCLUSION The ethical implications of using large language models in surgery are complex and evolving. The integration of these models into surgery necessitates continuous ethical discourse to ensure responsible and ethical use, balancing technological advancement with human dignity and safety.
Affiliation(s)
- Sahar Borna
- Division of Plastic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA
- Syed A. Haider
- Division of Plastic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA
- Clifton Haider
- Department of Physiology and Biomedical Engineering, Mayo Clinic, Rochester, MN 55905, USA
- Antonio J. Forte
- Division of Plastic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA
- Center for Digital Health, Mayo Clinic, Rochester, MN 55905, USA
13
Shorey S, Mattar C, Pereira TLB, Choolani M. A scoping review of ChatGPT's role in healthcare education and research. Nurse Educ Today 2024; 135:106121. [PMID: 38340639 DOI: 10.1016/j.nedt.2024.106121]
Abstract
OBJECTIVES To examine and consolidate literature regarding the advantages and disadvantages of utilizing ChatGPT in healthcare education and research. DESIGN/METHODS We searched seven electronic databases (PubMed/Medline, CINAHL, Embase, PsycINFO, Scopus, ProQuest Dissertations and Theses Global, and Web of Science) from November 2022 until September 2023. This scoping review adhered to Arksey and O'Malley's framework and followed the reporting guidelines outlined in the PRISMA-ScR checklist. For analysis, we employed Thomas and Harden's thematic synthesis framework. RESULTS A total of 100 studies were included. An overarching theme, "Forging the Future: Bridging Theory and Integration of ChatGPT," emerged, accompanied by two main themes, (1) Enhancing Healthcare Education, Research, and Writing with ChatGPT and (2) Controversies and Concerns about ChatGPT in Healthcare Education, Research, and Writing, and seven subthemes. CONCLUSIONS Our review underscores the importance of acknowledging legitimate concerns related to the potential misuse of ChatGPT, such as 'ChatGPT hallucinations', its limited understanding of specialized healthcare knowledge, its impact on teaching methods and assessments, confidentiality and security risks, and the controversial practice of crediting it as a co-author on scientific papers, among other considerations. Our review also recognizes the urgency of establishing timely guidelines and regulations, along with the active engagement of relevant stakeholders, to ensure the responsible and safe implementation of ChatGPT's capabilities. We advocate the use of cross-verification techniques to enhance the precision and reliability of generated content, the adaptation of higher-education curricula to incorporate ChatGPT's potential, educators' familiarization with the technology to improve their literacy and teaching approaches, and the development of innovative methods to detect ChatGPT usage. Finally, data protection measures should be prioritized when employing ChatGPT, and transparent reporting is crucial when integrating ChatGPT into academic writing.
Affiliation(s)
- Shefaly Shorey
- Alice Lee Centre for Nursing Studies, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Citra Mattar
- Division of Maternal Fetal Medicine, Department of Obstetrics and Gynaecology, National University Health Systems, Singapore; Department of Obstetrics and Gynaecology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Travis Lanz-Brian Pereira
- Alice Lee Centre for Nursing Studies, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Mahesh Choolani
- Division of Maternal Fetal Medicine, Department of Obstetrics and Gynaecology, National University Health Systems, Singapore; Department of Obstetrics and Gynaecology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
14
Li W, Chen J, Chen F, Liang J, Yu H. Exploring the Potential of ChatGPT-4 in Responding to Common Questions About Abdominoplasty: An AI-Based Case Study of a Plastic Surgery Consultation. Aesthetic Plast Surg 2024; 48:1571-1583. [PMID: 37770637 DOI: 10.1007/s00266-023-03660-0]
Abstract
BACKGROUND With the increasing integration of artificial intelligence (AI) in health care, AI chatbots like ChatGPT-4 are being used to deliver health information. OBJECTIVES This study aimed to assess the capability of ChatGPT-4 in answering common questions related to abdominoplasty, evaluating its potential as an adjunctive tool in patient education and preoperative consultation. METHODS A variety of common questions about abdominoplasty were submitted to ChatGPT-4. These questions were sourced from a question list provided by the American Society of Plastic Surgery to ensure their relevance and comprehensiveness. An experienced plastic surgeon meticulously evaluated the responses generated by ChatGPT-4 in terms of informational depth, response articulation, and competency to determine the proficiency of the AI in providing patient-centered information. RESULTS The study showed that ChatGPT-4 can give clear answers, making it useful for answering common queries. However, it struggled with personalized advice and sometimes provided incorrect or outdated references. Overall, ChatGPT-4 can effectively share abdominoplasty information, which may help patients better understand the procedure. Despite these positive findings, the AI needs more refinement, especially in providing personalized and accurate information, to fully meet patient education needs in plastic surgery. CONCLUSIONS Although ChatGPT-4 shows promise as a resource for patient education, continuous improvements and rigorous checks are essential for its beneficial integration into healthcare settings. The study emphasizes the need for further research, particularly focused on improving the personalization and accuracy of AI responses. LEVEL OF EVIDENCE V.
Affiliation(s)
- Wenbo Li
- Department of Nursing, Jinzhou Medical University, Jinzhou, 121001, China
- Junjiang Chen
- Department of Burn Plastic and Medical Aesthetic Surgery, The First Affiliated Hospital, Jinzhou Medical University, Jinzhou, China
- Fengmin Chen
- Department of Colorectal Surgery, The First Affiliated Hospital, Jinzhou Medical University, Jinzhou, China
- Jiaqing Liang
- Department of Nursing, Jinzhou Medical University, Jinzhou, 121001, China
- Hongyu Yu
- Department of Nursing, Jinzhou Medical University, Jinzhou, 121001, China
15
Atarere J, Naqvi H, Haas C, Adewunmi C, Bandaru S, Allamneni R, Ugonabo O, Egbo O, Umoren M, Kanth P. Applicability of Online Chat-Based Artificial Intelligence Models to Colorectal Cancer Screening. Dig Dis Sci 2024; 69:791-797. [PMID: 38267726 DOI: 10.1007/s10620-024-08274-3]
Abstract
BACKGROUND Over the past year, studies have shown potential in the applicability of ChatGPT in various medical specialties, including cardiology and oncology. However, the application of ChatGPT and other online chat-based AI models to patient education and patient-physician communication on colorectal cancer screening has not been critically evaluated, which this study aimed to do. METHODS We posed 15 questions on important colorectal cancer screening concepts and 5 common questions asked by patients to the 3 most commonly used freely available artificial intelligence (AI) models. The responses provided by the AI models were graded for appropriateness and reliability using American College of Gastroenterology guidelines; the responses to each question were graded as reliably appropriate (RA), reliably inappropriate (RI), or unreliable. Grader assessments were validated by the joint probability of agreement for two raters. RESULTS ChatGPT and YouChat™ provided RA responses to the questions posed more often than BingChat. There were two questions to which more than one AI model provided unreliable responses. ChatGPT did not provide references. BingChat misinterpreted some of the information it referenced. The colorectal cancer screening age provided by YouChat™ was not consistently up to date. Inter-rater reliability for the 2 raters was 89.2%. CONCLUSION Most responses provided by the AI models on CRC screening were appropriate; however, limitations exist in their ability to correctly interpret medical literature and provide updated information. Patients should consult their physicians for context on the recommendations made by these AI models.
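The joint probability of agreement used to validate the graders is a simple proportion, sketched below with hypothetical grade sequences.

```python
# Fraction of items on which two raters assigned the same grade.
grades_rater1 = ["RA", "RA", "unreliable", "RA", "RI", "RA", "RA", "unreliable"]
grades_rater2 = ["RA", "RA", "unreliable", "RA", "RA", "RA", "RA", "RI"]

agreement = sum(a == b for a, b in zip(grades_rater1, grades_rater2)) / len(grades_rater1)
print(f"joint probability of agreement = {agreement:.1%}")
```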
Affiliation(s)
- Joseph Atarere
- Department of Medicine, MedStar Health, 201 East University Pkwy, Baltimore, MD, 21218, USA
- Department of Biostatistics and Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Haider Naqvi
- Department of Medicine, MedStar Health, 201 East University Pkwy, Baltimore, MD, 21218, USA
- Christopher Haas
- Department of Medicine, MedStar Health, 201 East University Pkwy, Baltimore, MD, 21218, USA
- Comfort Adewunmi
- Division of Geriatrics and Gerontology, Emory University School of Medicine, Atlanta, GA, USA
- Sumanth Bandaru
- Department of Medicine, MedStar Health, 201 East University Pkwy, Baltimore, MD, 21218, USA
- Rakesh Allamneni
- Department of Medicine, MedStar Health, 201 East University Pkwy, Baltimore, MD, 21218, USA
- Onyinye Ugonabo
- Department of Medicine, Marshall University Joan C. Edwards School of Medicine, Huntington, WV, USA
- Olachi Egbo
- Department of Medicine, Aurora Medical Center, Oshkosh, WI, USA
- Mfoniso Umoren
- Division of Gastroenterology, Georgetown University Hospital, Washington, DC, USA
- Priyanka Kanth
- Division of Gastroenterology, Georgetown University Hospital, Washington, DC, USA
16
Madrid-García A, Rosales-Rosado Z, Freites-Nuñez D, Pérez-Sancristóbal I, Pato-Cour E, Plasencia-Rodríguez C, Cabeza-Osorio L, Abasolo-Alcázar L, León-Mateos L, Fernández-Gutiérrez B, Rodríguez-Rodríguez L. Harnessing ChatGPT and GPT-4 for evaluating the rheumatology questions of the Spanish access exam to specialized medical training. Sci Rep 2023; 13:22129. [PMID: 38092821 PMCID: PMC10719375 DOI: 10.1038/s41598-023-49483-6]
Abstract
The emergence of large language models (LLMs) with remarkable performance, such as ChatGPT and GPT-4, has led to unprecedented uptake in the population. One of their most promising and most studied applications concerns education, owing to their ability to understand and generate human-like text, which creates a multitude of opportunities for enhancing educational practices and outcomes. The objective of this study is twofold: to assess the accuracy of ChatGPT/GPT-4 in answering rheumatology questions from the access exam to specialized medical training in Spain (MIR), and to evaluate the medical reasoning these LLMs followed in answering those questions. A dataset of 145 rheumatology-related questions, RheumaMIR, extracted from the exams held between 2010 and 2023, was created for that purpose, used as a prompt for the LLMs, and publicly distributed. Six rheumatologists with clinical and teaching experience evaluated the clinical reasoning of the chatbots using a 5-point Likert scale, and their degree of agreement was analyzed. The association between variables that could influence the models' accuracy (i.e., year of the exam question, disease addressed, type of question, and genre) was studied. ChatGPT demonstrated a high level of performance in both accuracy, 66.43%, and clinical reasoning, median (Q1-Q3), 4.5 (2.33-4.67). However, GPT-4 showed better performance, with an accuracy of 93.71% and a median clinical reasoning value of 4.67 (4.5-4.83). These findings suggest that LLMs may serve as valuable tools in rheumatology education, aiding in exam preparation and supplementing traditional teaching methods.
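The accuracy and median (Q1-Q3) summaries reported here reduce to simple computations, sketched below; the answer key and reasoning ratings are invented examples, not RheumaMIR data.

```python
# Accuracy over an answer key, plus median and quartiles of Likert ratings.
import statistics

model_answers   = [3, 1, 4, 2, 5, 3, 1, 4, 2, 3]  # option chosen per question
correct_answers = [3, 1, 4, 2, 5, 2, 1, 3, 2, 3]

accuracy = sum(m == c for m, c in zip(model_answers, correct_answers)) / len(correct_answers)

reasoning = [4.67, 4.5, 5.0, 3.5, 4.83, 2.33, 4.5, 4.67, 4.0, 4.5]  # Likert scores
q1, median, q3 = statistics.quantiles(reasoning, n=4)
print(f"accuracy = {accuracy:.2%}; reasoning median (Q1-Q3) = {median} ({q1}-{q3})")
```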
Affiliation(s)
- Alfredo Madrid-García
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Zulema Rosales-Rosado
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Dalifer Freites-Nuñez
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Inés Pérez-Sancristóbal
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Esperanza Pato-Cour
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Luis Cabeza-Osorio
- Medicina Interna, Hospital Universitario del Henares, Avenida de Marie Curie, 0, 28822, Madrid, Spain
- Facultad de Medicina, Universidad Francisco de Vitoria, Carretera Pozuelo, Km 1800, 28223, Madrid, Spain
- Lydia Abasolo-Alcázar
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Leticia León-Mateos
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Benjamín Fernández-Gutiérrez
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Facultad de Medicina, Universidad Complutense de Madrid, Madrid, Spain
- Luis Rodríguez-Rodríguez
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
17
El Haj M, Boutoleau-Bretonnière C, Chapelet G. ChatGPT's dance with neuropsychological data: A case study in Alzheimer's disease. Ageing Res Rev 2023; 92:102117. [PMID: 37926396 DOI: 10.1016/j.arr.2023.102117]
Abstract
Artificial intelligence continues to revolutionize the medical and scientific fields, especially with the release of ChatGPT. We assessed whether it provides an accurate interpretation of neuropsychological screening. We provided ChatGPT with the neuropsychological data of a patient with mild Alzheimer's disease and invited it, along with two neuropsychologists, to interpret the data. While ChatGPT provided an accurate interpretation of the scores on each of the neuropsychological tests, it did not use standardized scores and did not specify the cognitive domain that may be most impaired. In contrast, the neuropsychologists used standardized scores to determine that the patient was mainly suffering from memory decline. While ChatGPT may succeed in the general interpretation of neuropsychological testing, at least in patients with Alzheimer's disease, it still cannot integrate scores across different tests into a pattern that specifies the nature of the cognitive impairment.
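The standardized-score step the neuropsychologists applied (and ChatGPT skipped) is sketched below; the normative means, SDs, and the impairment cutoff are placeholders, not published norms.

```python
# Convert raw test scores to z-scores against normative data and flag
# scores below a conventional impairment cutoff.
NORMS = {  # test -> (normative mean, normative SD); higher raw score = better
    "delayed_recall": (10.0, 2.5),
    "verbal_fluency": (35.0, 8.0),
    "naming": (14.0, 1.5),
}

patient_scores = {"delayed_recall": 4, "verbal_fluency": 31, "naming": 13}

for test, raw in patient_scores.items():
    mean, sd = NORMS[test]
    z = (raw - mean) / sd
    flag = "  <- likely impaired (z < -1.5)" if z < -1.5 else ""
    print(f"{test}: z = {z:+.2f}{flag}")
```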
Affiliation(s)
- Mohamad El Haj
- Institut Universitaire de France, Paris, France; CHU Nantes, Clinical Gerontology Department, Bd Jacques Monod, F44093 Nantes, France
- Guillaume Chapelet
- CHU Nantes, Clinical Gerontology Department, Bd Jacques Monod, F44093 Nantes, France; Université de Nantes, Inserm, TENS, The Enteric Nervous System in Gut and Brain Diseases, IMAD, Nantes, France
18
Chakraborty C, Pal S, Bhattacharya M, Dash S, Lee SS. Overview of Chatbots with special emphasis on artificial intelligence-enabled ChatGPT in medical science. Front Artif Intell 2023; 6:1237704. [PMID: 38028668 PMCID: PMC10644239 DOI: 10.3389/frai.2023.1237704]
Abstract
The release of ChatGPT has initiated new thinking about AI-based chatbots and their applications, and has drawn huge public attention worldwide. Researchers and doctors have started considering the promise and applications of AI-related large language models in medicine over the past few months. Here, this comprehensive review provides an overview of chatbots and ChatGPT and their current role in medicine. Firstly, the general idea of chatbots, their evolution, architecture, and medical uses are discussed. Secondly, ChatGPT is discussed with special emphasis on its applications in medicine, covering its architecture and training methods, medical diagnosis and treatment, research and ethical issues, and a comparison of ChatGPT with other NLP models. The article also discusses the limitations and prospects of ChatGPT. In the future, these large language models and ChatGPT will hold immense promise in healthcare. However, more research is needed in this direction.
Affiliation(s)
- Chiranjib Chakraborty
- Department of Biotechnology, School of Life Science and Biotechnology, Adamas University, Kolkata, West Bengal, India
- Soumen Pal
- School of Mechanical Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India
- Snehasish Dash
- School of Mechanical Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India
- Sang-Soo Lee
- Institute for Skeletal Aging and Orthopedic Surgery, Hallym University Chuncheon Sacred Heart Hospital, Chuncheon-si, Gangwon-do, Republic of Korea
19
Praveen SV, Vajrobol V. Can ChatGPT be Trusted for Consulting? Uncovering Doctor's Perceptions Using Deep Learning Techniques. Ann Biomed Eng 2023; 51:2116-2119. [PMID: 37208451 DOI: 10.1007/s10439-023-03245-7]
Abstract
Since the introduction of ChatGPT by OpenAI in late 2022, whether doctors can employ it for consultation has been a subject of debate. ChatGPT is a deep learning model trained on a vast dataset, but concerns about the reliability of its output have also been raised in recent times. In this article, we employed cutting-edge bidirectional encoder representations from transformers (BERT) sentiment analysis and topic modeling techniques to understand doctors' attitudes toward using ChatGPT in consultation.
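A minimal sketch of the BERT-based sentiment step follows, using the Hugging Face transformers pipeline; the default model and the sample comments are illustrative assumptions, not the authors' exact setup or data.

```python
# Classify doctor comments as POSITIVE/NEGATIVE with a pretrained BERT-family
# model; pipeline() downloads a default DistilBERT SST-2 checkpoint.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")

doctor_comments = [
    "ChatGPT produced a surprisingly thorough differential for this case.",
    "I would never trust it for drug dosing; it makes confident errors.",
]

for comment, result in zip(doctor_comments, classifier(doctor_comments)):
    print(f"{result['label']} ({result['score']:.2f}): {comment}")
```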
Affiliation(s)
- S V Praveen
- Department of Analytics, Xavier Institute of Management and Entrepreneurship, Bangalore, India
- Vajratiya Vajrobol
- Institute of Informatics and Communication, University of Delhi-South Campus, New Delhi, India
20
Ray PP. Revisiting the need for the use of GPT in surgery and medicine. Tech Coloproctol 2023; 27:959-960. [PMID: 37498419 DOI: 10.1007/s10151-023-02847-6]
Affiliation(s)
- P P Ray
- Sikkim University, Gangtok, India.
21
Shaffrey EC, Eftekari SC, Wilke LG, Poore SO. Surgeon or Bot? The Risks of Using Artificial Intelligence in Surgical Journal Publications. Ann Surg Open 2023; 4:e309. [PMID: 37746615 PMCID: PMC10513298 DOI: 10.1097/as9.0000000000000309]
Abstract
Mini-Abstract: ChatGPT is an artificial intelligence (AI) technology that has begun to transform academia through its ability to create human-like text. This has raised ethical concerns about its assistance in writing scientific literature. Our aim is to highlight the benefits and risks that this technology may pose to the surgical field.
Affiliation(s)
- Ellen C. Shaffrey
- Division of Plastic Surgery, University of Wisconsin School of Medicine and Public Health, Madison, WI
- Sahand C. Eftekari
- Division of Plastic Surgery, University of Wisconsin School of Medicine and Public Health, Madison, WI
- Lee G. Wilke
- Department of Surgery, University of Wisconsin School of Medicine and Public Health, Madison, WI
- Samuel O. Poore
- Division of Plastic Surgery, University of Wisconsin School of Medicine and Public Health, Madison, WI
22
Allahqoli L, Ghiasvand MM, Mazidimoradi A, Salehiniya H, Alkatout I. Diagnostic and Management Performance of ChatGPT in Obstetrics and Gynecology. Gynecol Obstet Invest 2023; 88:310-313. [PMID: 37494894 DOI: 10.1159/000533177]
Abstract
OBJECTIVES The use of artificial intelligence (AI) in clinical patient management and medical education has been advancing over time. ChatGPT was developed and trained recently using a large quantity of textual data from the internet, and medical science is expected to be transformed by its use. The present study was conducted to evaluate the diagnostic and management performance of the ChatGPT AI model in obstetrics and gynecology. DESIGN A cross-sectional study was conducted. PARTICIPANTS/MATERIALS, SETTING, METHODS This study was conducted in Iran in March 2023. Medical histories and examination results for 30 cases were compiled across six areas of obstetrics and gynecology. The cases were presented to a gynecologist and to ChatGPT for diagnosis and management; their answers were compared, and the diagnostic and management performance of ChatGPT was determined. RESULTS Ninety percent (27 of 30) of the obstetrics and gynecology cases were correctly handled by ChatGPT. Its responses were eloquent, informed, and free of a significant number of errors or misinformation. Even when the answers provided by ChatGPT were incorrect, the responses contained a logical explanation of the case as well as the information provided in the question stem. LIMITATIONS The data used in this study were taken from an electronic textbook, which may have introduced bias into ChatGPT's diagnoses. CONCLUSIONS This is the first evaluation of ChatGPT's diagnostic and management performance in the field of obstetrics and gynecology. It appears that ChatGPT has potential applications in the practice of medicine and is (currently) free and simple to use. However, several ethical considerations and limitations, such as bias, validity, copyright infringement, and plagiarism, need to be addressed in future studies.
Affiliation(s)
- Leila Allahqoli
- Midwifery Department, Ministry of Health and Medical Education, Tehran, Iran
- Afrooz Mazidimoradi
- Student Research Committee, Shiraz University of Medical Sciences, Shiraz, Iran
- Hamid Salehiniya
- Social Determinants of Health Research Center, Birjand University of Medical Sciences, Birjand, Iran
- Ibrahim Alkatout
- Campus Kiel, Kiel School of Gynaecological Endoscopy, University Hospitals Schleswig-Holstein, Kiel, Germany
23
Zhang B, Shi H, Wang H. Machine Learning and AI in Cancer Prognosis, Prediction, and Treatment Selection: A Critical Approach. J Multidiscip Healthc 2023; 16:1779-1791. [PMID: 37398894 PMCID: PMC10312208 DOI: 10.2147/jmdh.s410301]
Abstract
Cancer is a leading cause of morbidity and mortality worldwide. While progress has been made in the diagnosis, prognosis, and treatment of cancer patients, individualized and data-driven care remains a challenge. Artificial intelligence (AI), which is increasingly used for prediction and automation across many cancers, has emerged as a promising option for improving healthcare accuracy and patient outcomes. AI applications in oncology include risk assessment, early diagnosis, estimation of patient prognosis, and knowledge-based treatment selection. Machine learning (ML), a subset of AI that enables computers to learn from training data, has been highly effective at predicting various types of cancer, including breast, brain, lung, liver, and prostate cancer. In fact, AI and ML have demonstrated greater accuracy in predicting cancer than clinicians. These technologies also have the potential to improve the diagnosis, prognosis, and quality of life of patients with various illnesses, not just cancer. It is therefore important to improve current AI and ML technologies and to develop new programs to benefit patients. This article examines the use of AI and ML algorithms in cancer prediction, including their current applications, limitations, and future prospects.
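An illustrative end-to-end ML cancer-prediction workflow of the kind this review surveys is sketched below, using scikit-learn's bundled breast-cancer dataset; the model choice is a simple baseline, not a recommendation from the article.

```python
# Train a scaled logistic-regression classifier on the Wisconsin breast-cancer
# dataset and report held-out discrimination (ROC AUC).
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"held-out ROC AUC = {auc:.3f}")
```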
Affiliation(s)
- Bo Zhang
- Jinling Institute of Science and Technology, Nanjing City, Jiangsu Province, People’s Republic of China
- Huiping Shi
- Jinling Institute of Science and Technology, Nanjing City, Jiangsu Province, People’s Republic of China
- Hongtao Wang
- School of Life Science, Tonghua Normal University, Tonghua City, Jilin Province, People’s Republic of China