51. Hassona Y. Applications of artificial intelligence in special care dentistry. Spec Care Dentist 2024; 44:952-953. PMID: 37532677. DOI: 10.1111/scd.12911.
Affiliation(s)
- Yazan Hassona
- School of Dentistry, The University of Jordan, Amman, Jordan
- School of Dentistry, Al Ahliyya Amman University, Amman, Jordan
52. Han Z, Battaglia F, Udaiyar A, Fooks A, Terlecky SR. An explorative assessment of ChatGPT as an aid in medical education: Use it with caution. Med Teach 2024; 46:657-664. PMID: 37862566. DOI: 10.1080/0142159x.2023.2271159.
Abstract
OBJECTIVE To explore the use of ChatGPT by educators and students in a medical school setting. METHOD This study used the public version of ChatGPT launched by OpenAI on November 30, 2022 (https://openai.com/blog/chatgpt/). We employed prompts to ask ChatGPT to 1) generate a content outline for a session on the topics of cholesterol, lipoproteins, and hyperlipidemia for medical students; 2) produce a list of learning objectives for the session; and 3) write assessment questions, with and without clinical vignettes, related to the identified learning objectives. We assessed ChatGPT's responses for accuracy and reliability to determine the chatbot's potential as an aid to educators and as a "know-it-all" medical information provider for students. RESULTS ChatGPT can function as an aid to educators, but it is not yet suitable as a reliable information resource for educators and medical students. CONCLUSION ChatGPT can be a useful tool to assist medical educators in drafting course and session content outlines and creating assessment questions. At the same time, caution must be taken, as ChatGPT is prone to providing incorrect information; expert oversight is necessary to ensure the information generated is accurate and beneficial to students. It is therefore premature for medical students to use the current version of ChatGPT as a "know-it-all" information provider. In the future, medical educators should work with programming experts to explore and realize the full potential of AI in medical education.
Affiliation(s)
- Zhiyong Han
- Department of Medical Sciences, Hackensack Meridian School of Medicine, Nutley, NJ, USA
- Fortunato Battaglia
- Department of Medical Sciences, Hackensack Meridian School of Medicine, Nutley, NJ, USA
- Abinav Udaiyar
- Department of Medical Sciences, Hackensack Meridian School of Medicine, Nutley, NJ, USA
- Allen Fooks
- Department of Medical Sciences, Hackensack Meridian School of Medicine, Nutley, NJ, USA
- Stanley R Terlecky
- Department of Medical Sciences, Hackensack Meridian School of Medicine, Nutley, NJ, USA
53. Niko MM, Karbasi Z, Kazemi M, Zahmatkeshan M. Comparing ChatGPT and Bing, in response to the Home Blood Pressure Monitoring (HBPM) knowledge checklist. Hypertens Res 2024; 47:1401-1409. PMID: 38438722. DOI: 10.1038/s41440-024-01624-8.
Abstract
High blood pressure is one of the major public health problems worldwide. Due to the rapid increase in the number of users of artificial intelligence tools such as ChatGPT and Bing, patients are expected to use these tools as a source of information about high blood pressure. The purpose of this study was to check the accuracy, completeness, and reproducibility of answers provided by ChatGPT and Bing to a knowledge questionnaire on blood pressure control at home. ChatGPT's and Bing's responses to the 10-question HBPM knowledge checklist on blood pressure measurement were independently reviewed by three cardiologists. The mean accuracy and completeness ratings of ChatGPT were 5.96 (SD = 0.17) and 2.93 (SD = 0.25), indicating that its responses were highly accurate overall, with the vast majority receiving the top score; the corresponding ratings for Bing were 5.31 (SD = 0.67) and 2.13 (SD = 0.53). Given the expansion of artificial intelligence applications, patients can use new tools such as ChatGPT and Bing to search for health information. We found that the answers obtained from ChatGPT are reliable and valuable for patients; while Bing is also a powerful tool, it has more limitations than ChatGPT, and its answers should be interpreted with caution.
Affiliation(s)
- Zahra Karbasi
- Department of Health Information Sciences, Faculty of Management and Medical Information Sciences, Kerman University of Medical Sciences, Kerman, Iran
- Maryam Kazemi
- Noncommunicable Diseases Research Center, Fasa University of Medical Sciences, Fasa, Iran
- Maryam Zahmatkeshan
- Noncommunicable Diseases Research Center, Fasa University of Medical Sciences, Fasa, Iran
- School of Allied Medical Sciences, Fasa University of Medical Sciences, Fasa, Iran
54. Simms RC. Work With ChatGPT, Not Against: 3 Teaching Strategies That Harness the Power of Artificial Intelligence. Nurse Educ 2024; 49:158-161. PMID: 38502607. DOI: 10.1097/nne.0000000000001634.
Abstract
BACKGROUND Technological advances have expanded nursing education to include generative artificial intelligence (AI) tools such as ChatGPT. PROBLEM Generative AI tools challenge academic integrity, complicate the validation of information accuracy, and require strategies to ensure the credibility of AI-generated information. APPROACH This article presents a dual-purpose approach integrating AI tools into prelicensure nursing education to enhance learning while promoting critical evaluation skills. Constructivist theories and Vygotsky's Zone of Proximal Development framework support this integration, with AI as a scaffold for developing critical thinking. OUTCOMES The approach involves practical activities in which students engage critically with AI-generated content, thereby reinforcing clinical judgment and preparing them for AI-prevalent health care environments. CONCLUSIONS Incorporating AI tools such as ChatGPT into nursing curricula represents a strategic educational advancement, equipping students with essential skills to navigate modern health care.
Affiliation(s)
- Rachel Cox Simms
- Assistant Professor, School of Nursing, MGH Institute of Health Professions, Boston, Massachusetts
55. Carobene A, Padoan A, Cabitza F, Banfi G, Plebani M. Rising adoption of artificial intelligence in scientific publishing: evaluating the role, risks, and ethical implications in paper drafting and review process. Clin Chem Lab Med 2024; 62:835-843. PMID: 38019961. DOI: 10.1515/cclm-2023-1136.
Abstract
BACKGROUND In the rapidly evolving landscape of artificial intelligence (AI), scientific publishing is experiencing significant transformations. AI tools, while offering unparalleled efficiencies in paper drafting and peer review, also introduce notable ethical concerns. CONTENT This study delineates AI's dual role in scientific publishing: as a co-creator in the writing and review of scientific papers and as an ethical challenge. We first explore the potential of AI as an enhancer of efficiency, efficacy, and quality in creating scientific papers. A critical assessment follows, weighing the risks against the rewards for researchers, especially those early in their careers, and emphasizing the need to balance AI's capabilities with the fostering of independent reasoning and creativity. Subsequently, we delve into the ethical dilemmas of AI's involvement, particularly concerning originality, plagiarism, and preserving the genuine essence of scientific discourse. The evolving dynamics further highlight an overlooked aspect: the inadequate recognition of human reviewers in the academic community. With the increasing volume of scientific literature, tangible metrics and incentives for reviewers are proposed as essential to ensure a balanced academic environment. SUMMARY AI's incorporation in scientific publishing is promising yet comes with significant ethical and operational challenges. The role of human reviewers is accentuated, ensuring authenticity in an AI-influenced environment. OUTLOOK As the scientific community treads the path of AI integration, a balanced symbiosis between AI's efficiency and human discernment is pivotal. Emphasizing human expertise while exploiting artificial intelligence responsibly will determine the trajectory of an ethically sound and efficient AI-augmented future in scientific publishing.
Affiliation(s)
- Anna Carobene
- Laboratory Medicine, IRCCS San Raffaele Scientific Institute, Milan, Italy
- Andrea Padoan
- Department of Medicine-DIMED, University of Padova, Padova, Italy
- Laboratory Medicine Unit, University Hospital of Padova, Padova, Italy
- Federico Cabitza
- DISCo, Università Degli Studi di Milano-Bicocca, Milan, Italy
- IRCCS Ospedale Galeazzi - Sant'Ambrogio, Milan, Italy
- Giuseppe Banfi
- IRCCS Ospedale Galeazzi - Sant'Ambrogio, Milan, Italy
- University Vita-Salute San Raffaele, Milan, Italy
- Mario Plebani
- Laboratory Medicine Unit, University Hospital of Padova, Padova, Italy
- University of Padova, Padova, Italy
56. Weichelt BP, Pilz M, Burke R, Puthoff D, Namkoong K. The Potential of AI and ChatGPT in Improving Agricultural Injury and Illness Surveillance Programming and Dissemination. J Agromedicine 2024; 29:150-154. PMID: 38050835. DOI: 10.1080/1059924x.2023.2284959.
Abstract
Generative Artificial Intelligence (AI) provides unprecedented opportunities to improve injury surveillance systems in many ways, including the curation and publication of information related to agricultural injuries and illnesses. This editorial explores the feasibility and implications of ChatGPT integration in an international sentinel agricultural injury surveillance system, AgInjuryNews, highlighting that AI integration may enhance workflows by reducing the human and financial resources required while increasing outputs. In the coming years, text-intensive natural language reports in AgInjuryNews and similar systems could be a rich source of data for ChatGPT or other more customized, fine-tuned LLMs. By harnessing the capabilities of AI and NLP, teams could potentially streamline data analysis, report generation, and public dissemination, ultimately contributing to improved agricultural injury prevention well beyond any manually driven efforts.
Affiliation(s)
- Bryan P Weichelt
- National Children's Center for Rural and Agricultural Health and Safety; National Farm Medicine Center, Marshfield Clinic Research Institute, Marshfield, WI, USA
- Matthew Pilz
- National Children's Center for Rural and Agricultural Health and Safety; National Farm Medicine Center, Marshfield Clinic Research Institute, Marshfield, WI, USA
- Richard Burke
- National Children's Center for Rural and Agricultural Health and Safety; National Farm Medicine Center, Marshfield Clinic Research Institute, Marshfield, WI, USA
- David Puthoff
- Office of Research and Sponsored Programs, Marshfield Clinic Research Institute, Marshfield, WI, USA
- Kang Namkoong
- Department of Communication, University of Maryland, College Park, MD, USA
57. Shorey S, Mattar C, Pereira TLB, Choolani M. A scoping review of ChatGPT's role in healthcare education and research. Nurse Educ Today 2024; 135:106121. PMID: 38340639. DOI: 10.1016/j.nedt.2024.106121.
Abstract
OBJECTIVES To examine and consolidate literature regarding the advantages and disadvantages of utilizing ChatGPT in healthcare education and research. DESIGN/METHODS We searched seven electronic databases (PubMed/Medline, CINAHL, Embase, PsycINFO, Scopus, ProQuest Dissertations and Theses Global, and Web of Science) from November 2022 until September 2023. This scoping review adhered to Arksey and O'Malley's framework and followed the reporting guidelines outlined in the PRISMA-ScR checklist. For analysis, we employed Thomas and Harden's thematic synthesis framework. RESULTS A total of 100 studies were included. An overarching theme, "Forging the Future: Bridging Theory and Integration of ChatGPT," emerged, accompanied by two main themes, (1) Enhancing Healthcare Education, Research, and Writing with ChatGPT and (2) Controversies and Concerns about ChatGPT in Healthcare Education, Research, and Writing, and seven subthemes. CONCLUSIONS Our review underscores the importance of acknowledging legitimate concerns related to the potential misuse of ChatGPT, such as 'ChatGPT hallucinations', its limited understanding of specialized healthcare knowledge, its impact on teaching methods and assessments, confidentiality and security risks, and the controversial practice of crediting it as a co-author on scientific papers. Our review also recognizes the urgency of establishing timely guidelines and regulations, along with the active engagement of relevant stakeholders, to ensure the responsible and safe implementation of ChatGPT's capabilities. We advocate for the use of cross-verification techniques to enhance the precision and reliability of generated content, the adaptation of higher education curricula to incorporate ChatGPT's potential, educators' familiarization with the technology to improve their literacy and teaching approaches, and the development of innovative methods to detect ChatGPT usage. Furthermore, data protection measures should be prioritized when employing ChatGPT, and transparent reporting becomes crucial when integrating ChatGPT into academic writing.
Affiliation(s)
- Shefaly Shorey
- Alice Lee Centre for Nursing Studies, Yong Loo Lin School of Medicine, National University of Singapore, Singapore.
- Citra Mattar
- Division of Maternal Fetal Medicine, Department of Obstetrics and Gynaecology, National University Health Systems, Singapore
- Department of Obstetrics and Gynaecology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Travis Lanz-Brian Pereira
- Alice Lee Centre for Nursing Studies, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Mahesh Choolani
- Division of Maternal Fetal Medicine, Department of Obstetrics and Gynaecology, National University Health Systems, Singapore
- Department of Obstetrics and Gynaecology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
58. Breeding T, Martinez B, Patel H, Nasef H, Arif H, Nakayama D, Elkbuli A. The Utilization of ChatGPT in Reshaping Future Medical Education and Learning Perspectives: A Curse or a Blessing? Am Surg 2024; 90:560-566. PMID: 37309705. DOI: 10.1177/00031348231180950.
Abstract
BACKGROUND ChatGPT has substantial potential to revolutionize medical education. We aim to assess how medical students and laypeople evaluate information produced by ChatGPT compared to an evidence-based resource on the diagnosis and management of 5 common surgical conditions. METHODS A 60-question anonymous online survey was distributed to third- and fourth-year U.S. medical students and laypeople to evaluate articles produced by ChatGPT and an evidence-based source on clarity, relevance, reliability, validity, organization, and comprehensiveness. Participants received 2 blinded articles, 1 from each source, for each surgical condition. Paired-sample t-tests were used to compare ratings between the 2 sources. RESULTS Of 55 survey participants, 50.9% (n = 28) were U.S. medical students and 49.1% (n = 27) were from the general population. Medical students reported that ChatGPT articles displayed significantly more clarity (appendicitis: 4.39 vs 3.89, P = .020; diverticulitis: 4.54 vs 3.68, P < .001; SBO: 4.43 vs 3.79, P = .003; GI bleed: 4.36 vs 3.93, P = .020) and better organization (diverticulitis: 4.36 vs 3.68, P = .021; SBO: 4.39 vs 3.82, P = .033) than the evidence-based source. However, for all 5 conditions, medical students found evidence-based passages to be more comprehensive than ChatGPT articles (cholecystitis: 4.04 vs 3.36, P = .009; appendicitis: 4.07 vs 3.36, P = .015; diverticulitis: 4.07 vs 3.36, P = .015; small bowel obstruction: 4.11 vs 3.54, P = .030; upper GI bleed: 4.11 vs 3.29, P = .003). CONCLUSION Medical students perceived ChatGPT articles to be clearer and better organized than evidence-based sources on the pathogenesis, diagnosis, and management of 5 common surgical pathologies. However, evidence-based articles were rated as significantly more comprehensive.
Affiliation(s)
- Tessa Breeding
- Kiran Patel College of Allopathic Medicine, NOVA Southeastern University, Fort Lauderdale, FL, USA
- Brian Martinez
- Kiran Patel College of Allopathic Medicine, NOVA Southeastern University, Fort Lauderdale, FL, USA
- Heli Patel
- Kiran Patel College of Allopathic Medicine, NOVA Southeastern University, Fort Lauderdale, FL, USA
- Hazem Nasef
- Kiran Patel College of Allopathic Medicine, NOVA Southeastern University, Fort Lauderdale, FL, USA
- Hasan Arif
- Kiran Patel College of Allopathic Medicine, NOVA Southeastern University, Fort Lauderdale, FL, USA
- Don Nakayama
- Mercer University School of Medicine, Columbus, GA, USA
- Department of Pediatric Surgery, Piedmont Columbus Regional Hospital, Piedmont, GA, USA
- Adel Elkbuli
- Department of Surgery, Division of Trauma and Surgical Critical Care, Orlando Regional Medical Center, Orlando, FL, USA
- Department of Surgical Education, Orlando Regional Medical Center, Orlando, FL, USA
59. Halawani A, Mitchell A, Saffarzadeh M, Wong V, Chew BH, Forbes CM. Accuracy and Readability of Kidney Stone Patient Information Materials Generated by a Large Language Model Compared to Official Urologic Organizations. Urology 2024; 186:107-113. PMID: 38395071. DOI: 10.1016/j.urology.2023.11.042.
Abstract
OBJECTIVE To compare the readability and accuracy of patient information materials (PIMs) generated by a large language model with those supplied by the American Urological Association (AUA), Canadian Urological Association (CUA), and European Association of Urology (EAU) for kidney stones. METHODS PIMs from the AUA, CUA, and EAU related to nephrolithiasis were obtained and categorized. The most frequent patient questions related to kidney stones were identified from an internet query and input into GPT-3.5 and GPT-4. PIMs and ChatGPT outputs were assessed for accuracy and readability using previously published indexes. We also assessed changes in ChatGPT outputs when a reading level (grade 6) was specified. RESULTS Readability scores were better for PIMs from the CUA (grade level 10-12), AUA (8-10), or EAU (9-11) than for the chatbot. GPT-3.5 had the worst readability scores at grade 13-14, and GPT-4 was likewise less readable than urologic organization PIMs, with scores of 11-13. While organizational PIMs were deemed accurate, the chatbot also had high accuracy, with only minor details omitted. GPT-4 was more accurate than GPT-3.5 on general stone information and on the dietary and medical management of kidney stones, while both models had the same accuracy on the surgical management of nephrolithiasis. CONCLUSION Current PIMs from major urologic organizations for kidney stones remain more readable than publicly available GPT outputs, but both are still above the reading ability of the general population. Of the available PIMs for kidney stones, those from the AUA are the most readable. Although chatbot outputs for common kidney stone patient queries have a high degree of accuracy with minor omitted details, it is important for clinicians to understand their strengths and limitations.
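The grade-level scores reported above come from standard readability indexes. As an illustration only (the Flesch-Kincaid grade level is one commonly used index; the abstract does not specify which indexes the study applied), the score is a simple function of word, sentence, and syllable counts:

```python
def fk_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level computed from raw text counts:
    longer sentences and more syllables per word raise the grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# A hypothetical 100-word passage with 5 sentences and 170 syllables
# scores at roughly grade 12, in the range reported for GPT-4 outputs.
grade = fk_grade(words=100, sentences=5, syllables=170)  # ~12.27
```

Shortening sentences and preferring shorter words is what drives such scores down, which is why prompting for a grade-6 reading level is a plausible mitigation to test.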
Affiliation(s)
- Abdulghafour Halawani
- Department of Urology, King Abdulaziz University, Jeddah, Saudi Arabia
- Department of Urological Sciences, University of British Columbia, Stone Centre at Vancouver General Hospital, Vancouver, British Columbia, Canada
- Alec Mitchell
- Department of Urological Sciences, University of British Columbia, Stone Centre at Vancouver General Hospital, Vancouver, British Columbia, Canada
- Mohammadali Saffarzadeh
- Department of Urological Sciences, University of British Columbia, Stone Centre at Vancouver General Hospital, Vancouver, British Columbia, Canada
- Victor Wong
- Department of Urological Sciences, University of British Columbia, Stone Centre at Vancouver General Hospital, Vancouver, British Columbia, Canada
- Ben H Chew
- Department of Urological Sciences, University of British Columbia, Stone Centre at Vancouver General Hospital, Vancouver, British Columbia, Canada
- Connor M Forbes
- Department of Urological Sciences, University of British Columbia, Stone Centre at Vancouver General Hospital, Vancouver, British Columbia, Canada
- Vancouver Prostate Centre, Vancouver, British Columbia, Canada
60. Zampatti S, Peconi C, Megalizzi D, Calvino G, Trastulli G, Cascella R, Strafella C, Caltagirone C, Giardina E. Innovations in Medicine: Exploring ChatGPT's Impact on Rare Disorder Management. Genes (Basel) 2024; 15:421. PMID: 38674356. PMCID: PMC11050022. DOI: 10.3390/genes15040421.
Abstract
Artificial intelligence (AI) is rapidly transforming the field of medicine, heralding a new era of innovation and efficiency. Among AI programs designed for general use, ChatGPT holds a prominent position, built on an innovative language model developed by OpenAI. Thanks to deep learning techniques, ChatGPT stands out as an exceptionally capable tool, renowned for generating human-like responses to queries. Various medical specialties, including rheumatology, oncology, psychiatry, internal medicine, and ophthalmology, have been explored for ChatGPT integration, with pilot studies and trials revealing each field's potential benefits and challenges. The field of genetics and genetic counseling, as well as that of rare disorders, remains an area ripe for exploration, with its complex datasets and the need for personalized patient care. In this review, we synthesize the wide range of potential applications for ChatGPT in the medical field, highlighting its benefits and limitations. We pay special attention to rare and genetic disorders, aiming to shed light on the future roles of AI-driven chatbots in healthcare. Our goal is to pave the way for a healthcare system that is more knowledgeable, efficient, and centered on patient needs.
Affiliation(s)
- Stefania Zampatti
- Genomic Medicine Laboratory UILDM, IRCCS Santa Lucia Foundation, 00179 Rome, Italy
- Cristina Peconi
- Genomic Medicine Laboratory UILDM, IRCCS Santa Lucia Foundation, 00179 Rome, Italy
- Domenica Megalizzi
- Genomic Medicine Laboratory UILDM, IRCCS Santa Lucia Foundation, 00179 Rome, Italy
- Department of Science, Roma Tre University, 00146 Rome, Italy
- Giulia Calvino
- Genomic Medicine Laboratory UILDM, IRCCS Santa Lucia Foundation, 00179 Rome, Italy
- Department of Science, Roma Tre University, 00146 Rome, Italy
- Giulia Trastulli
- Genomic Medicine Laboratory UILDM, IRCCS Santa Lucia Foundation, 00179 Rome, Italy
- Department of System Medicine, Tor Vergata University, 00133 Rome, Italy
- Raffaella Cascella
- Genomic Medicine Laboratory UILDM, IRCCS Santa Lucia Foundation, 00179 Rome, Italy
- Department of Chemical-Toxicological and Pharmacological Evaluation of Drugs, Catholic University Our Lady of Good Counsel, 1000 Tirana, Albania
- Claudia Strafella
- Genomic Medicine Laboratory UILDM, IRCCS Santa Lucia Foundation, 00179 Rome, Italy
- Carlo Caltagirone
- Department of Clinical and Behavioral Neurology, IRCCS Fondazione Santa Lucia, 00179 Rome, Italy
- Emiliano Giardina
- Genomic Medicine Laboratory UILDM, IRCCS Santa Lucia Foundation, 00179 Rome, Italy
- Department of Biomedicine and Prevention, Tor Vergata University, 00133 Rome, Italy
61. Popkov AA, Barrett TS. AI vs academia: Experimental study on AI text detectors' accuracy in behavioral health academic writing. Account Res 2024:1-17. PMID: 38516933. DOI: 10.1080/08989621.2024.2331757.
Abstract
Artificial Intelligence (AI) language models continue to expand in both access and capability. As these models have evolved, the number of academic journals in medicine and healthcare which have explored policies regarding AI-generated text has increased. The implementation of such policies requires accurate AI detection tools. Inaccurate detectors risk unnecessary penalties for human authors and/or may compromise the effective enforcement of guidelines against AI-generated content. Yet, the accuracy of AI text detection tools in identifying human-written versus AI-generated content has been found to vary across published studies. This experimental study used a sample of behavioral health publications and found problematic false positive and false negative rates from both free and paid AI detection tools. The study assessed 100 research articles from 2016-2018 in behavioral health and psychiatry journals and 200 texts produced by AI chatbots (100 by "ChatGPT" and 100 by "Claude"). The free AI detector showed a median of 27.2% for the proportion of academic text identified as AI-generated, while commercial software Originality.AI demonstrated better performance but still had limitations, especially in detecting texts generated by Claude. These error rates raise doubts about relying on AI detectors to enforce strict policies around AI text generation in behavioral health publications.
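The false positive and false negative rates discussed above follow directly from a detector's confusion counts, treating "AI-generated" as the positive class. A minimal sketch with hypothetical counts, not the study's results:

```python
def detector_error_rates(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Error rates for an AI-text detector, with 'AI-generated' as positive.
    A false positive is a human-written text wrongly flagged as AI-generated;
    a false negative is an AI-generated text that passes as human-written."""
    return {
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
    }

# Hypothetical corpus: 100 human texts (12 wrongly flagged)
# and 200 AI texts (150 correctly flagged, 50 missed)
rates = detector_error_rates(tp=150, fp=12, tn=88, fn=50)
# false_positive_rate = 0.12, false_negative_rate = 0.25
```

The asymmetry matters for policy: false positives penalize innocent authors, while false negatives undermine enforcement, so both rates need to be low before a detector can back a strict publishing policy.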
Affiliation(s)
- Andrey A Popkov
- Highmark Health, Pittsburgh, PA, USA
- Contigo Health, LLC, a subsidiary of Premier, Inc, Charlotte, NC, USA
62. Sánchez-Rosenberg G, Magnéli M, Barle N, Kontakis MG, Müller AM, Wittauer M, Gordon M, Brodén C. ChatGPT-4 generates orthopedic discharge documents faster than humans maintaining comparable quality: a pilot study of 6 cases. Acta Orthop 2024; 95:152-156. PMID: 38597205. PMCID: PMC10959013. DOI: 10.2340/17453674.2024.40182.
Abstract
BACKGROUND AND PURPOSE Large language models like ChatGPT-4 hold the potential to reduce the administrative burden of healthcare by generating everyday clinical documents, allowing the physician to spend more time with the patient. We aimed to assess both the quality and the efficiency of discharge documents generated by ChatGPT-4 in comparison with those produced by physicians. PATIENTS AND METHODS To emulate real-world situations, the health records of 6 fictional orthopedic cases were created. Discharge documents for each case were generated by a junior attending orthopedic surgeon and an advanced orthopedic resident. ChatGPT-4 was then prompted to generate the discharge documents using the same health record information. The quality assessment was performed by an expert panel (n = 15) blinded to the source of the documents. As a secondary outcome, the time required to generate the documents was compared, logging the duration of document creation by the physicians and by ChatGPT-4. RESULTS ChatGPT-4-generated and physician-generated notes were comparable in quality overall. Notably, ChatGPT-4 generated discharge documents 10 times faster than the traditional method. Four instances of hallucination were found in the ChatGPT-4-generated content, compared with 6 in the physician-produced notes. CONCLUSION ChatGPT-4 creates orthopedic discharge notes faster than physicians, with comparable quality, and thus has great potential to reduce the administrative burden on healthcare professionals in orthopedic care.
Affiliation(s)
- Martin Magnéli
- Karolinska Institute, Department of Clinical Sciences at Danderyd Hospital, Stockholm, Sweden
- Niklas Barle
- Karolinska Institute, Department of Clinical Sciences at Danderyd Hospital, Stockholm, Sweden
- Michael G Kontakis
- Department of Surgical Sciences, Orthopedics, Uppsala University Hospital, Uppsala, Sweden
- Andreas Marc Müller
- Department of Orthopedic and Trauma Surgery, University Hospital Basel, Switzerland
- Matthias Wittauer
- Department of Orthopedic and Trauma Surgery, University Hospital Basel, Switzerland
- Max Gordon
- Karolinska Institute, Department of Clinical Sciences at Danderyd Hospital, Stockholm, Sweden
- Cyrus Brodén
- Department of Surgical Sciences, Orthopedics, Uppsala University Hospital, Uppsala, Sweden
63. Surapaneni KM, Rajajagadeesan A, Goudhaman L, Lakshmanan S, Sundaramoorthi S, Ravi D, Rajendiran K, Swaminathan P. Evaluating ChatGPT as a self-learning tool in medical biochemistry: A performance assessment in undergraduate medical university examination. Biochem Mol Biol Educ 2024; 52:237-248. PMID: 38112255. DOI: 10.1002/bmb.21808.
Abstract
The emergence of ChatGPT as one of the most advanced chatbots and its ability to generate diverse data has given room for numerous discussions worldwide regarding its utility, particularly in advancing medical education and research. This study seeks to assess the performance of ChatGPT in medical biochemistry to evaluate its potential as an effective self-learning tool for medical students. The evaluation was carried out using the university examination question papers of parts 1 and 2 of medical biochemistry, which comprised theory and multiple choice questions (MCQs) totaling 100 marks in each part. The questions were used to interact with ChatGPT, and three raters independently reviewed and scored the answers to prevent bias in scoring. We computed the inter-item correlation matrix and the intraclass correlation between raters 1, 2, and 3. For MCQs, agreement between raters 1, 2, and 3 was measured with the kappa statistic. ChatGPT generated relevant and appropriate answers to all questions, along with explanations for MCQs. ChatGPT has "passed" the medical biochemistry university examination with an average score of 117 out of 200 (58%) across both papers, securing 60 ± 2.29 in Paper 1 and 57 ± 4.36 in Paper 2. The kappa value for every cross-analysis of Rater 1, Rater 2, and Rater 3 MCQ scores was 1.000. The evaluation of ChatGPT as a self-learning tool in medical biochemistry has yielded important insights. While it is encouraging that ChatGPT has demonstrated proficiency in this area, the overall score of 58% indicates that there is work to be done. To unlock its full potential as a self-learning tool, ChatGPT must generate content that is not only accurate but also comprehensive and contextually relevant.
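The perfect inter-rater agreement reported above (kappa = 1.000) can be made concrete with a minimal Cohen's kappa computation. This is an illustrative sketch with made-up MCQ scores, not the study's data:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    # Observed agreement: fraction of items the raters scored identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if the raters scored independently at their own base rates.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return 1.0 if p_e == 1.0 else (p_o - p_e) / (1.0 - p_e)

# Identical scores (1 = correct, 0 = incorrect) give kappa = 1.0,
# the perfect agreement reported between the three raters.
identical = cohens_kappa([1, 0, 1, 1, 0, 1], [1, 0, 1, 1, 0, 1])
# One disagreement in six items lowers kappa to about 0.67.
partial = cohens_kappa([1, 0, 1, 1, 0, 1], [1, 0, 1, 0, 0, 1])
```

In practice each rater pair (1 vs. 2, 1 vs. 3, 2 vs. 3) is evaluated separately, which is what yields the three reported cross-analysis kappa values.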
Affiliation(s)
- Krishna Mohan Surapaneni
- Department of Biochemistry, Panimalar Medical College Hospital & Research Institute, Chennai, India
- Department of Medical Education, Panimalar Medical College Hospital & Research Institute, Chennai, India
- Department of Clinical Skills & Simulation, Panimalar Medical College Hospital & Research Institute, Chennai, India
- Anusha Rajajagadeesan
- Department of Biochemistry, Panimalar Medical College Hospital & Research Institute, Chennai, India
- Lakshmi Goudhaman
- Department of Biochemistry, Panimalar Medical College Hospital & Research Institute, Chennai, India
- Shalini Lakshmanan
- Department of Biochemistry, Panimalar Medical College Hospital & Research Institute, Chennai, India
- Saranya Sundaramoorthi
- Department of Biochemistry, Panimalar Medical College Hospital & Research Institute, Chennai, India
- Dineshkumar Ravi
- Department of Biochemistry, Panimalar Medical College Hospital & Research Institute, Chennai, India
- Kalaiselvi Rajendiran
- Department of Biochemistry, Panimalar Medical College Hospital & Research Institute, Chennai, India
- Porchelvan Swaminathan
- Department of Community Medicine, Panimalar Medical College Hospital & Research Institute, Chennai, India
64
Tang A, Li KK, Kwok KO, Cao L, Luong S, Tam W. The importance of transparency: Declaring the use of generative artificial intelligence (AI) in academic writing. J Nurs Scholarsh 2024; 56:314-318. [PMID: 37904646 DOI: 10.1111/jnu.12938] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2023] [Revised: 08/21/2023] [Accepted: 10/18/2023] [Indexed: 11/01/2023]
Abstract
The integration of generative artificial intelligence (AI) into academic research writing has revolutionized the field, offering powerful tools like ChatGPT and Bard to aid researchers in content generation and idea enhancement. We explore the current state of transparency regarding generative AI use in nursing academic research journals, emphasizing the need for authors to explicitly declare the use of generative AI in the manuscript. Of 125 nursing studies journals, 37.6% required explicit statements about generative AI use in their author guidelines. No significant differences in impact factors or journal categories were found between journals with and without such a requirement. A similar evaluation of journals in the medicine, general and internal category showed a lower percentage (14.5%) requiring information about generative AI usage. Declaring generative AI tool usage is crucial for maintaining transparency and credibility in academic writing. Additionally, extending the requirement for AI usage declarations to journal reviewers can enhance the quality of peer review and combat predatory journals in the academic publishing landscape. Our study highlights the need for active participation from nursing researchers in discussions surrounding the standardization of generative AI declaration in academic research writing.
Affiliation(s)
- Arthur Tang
- School of Science, Engineering and Technology, RMIT University, Ho Chi Minh City, Vietnam
- Kin-Kit Li
- Department of Social and Behavioural Sciences, City University of Hong Kong, Hong Kong, Hong Kong Special Administrative Region
- Kin On Kwok
- JC School of Public Health and Primary Care, The Chinese University of Hong Kong, Hong Kong, Hong Kong Special Administrative Region
- Stanley Ho Centre for Emerging Infectious Diseases, The Chinese University of Hong Kong, Hong Kong, Hong Kong Special Administrative Region
- Hong Kong Institute of Asia-Pacific Studies, The Chinese University of Hong Kong, Hong Kong, Hong Kong Special Administrative Region
- Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, UK
- Liujiao Cao
- West China School of Nursing/West China Hospital, Sichuan University, Chengdu, China
- Stanley Luong
- School of Science, Engineering and Technology, RMIT University, Ho Chi Minh City, Vietnam
- Wilson Tam
- Alice Lee Centre for Nursing Studies, National University of Singapore, Singapore
65
Ashraf H, Ashfaq H. The Role of ChatGPT in Medical Research: Progress and Limitations. Ann Biomed Eng 2024; 52:458-461. [PMID: 37452215 DOI: 10.1007/s10439-023-03311-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/03/2023] [Accepted: 07/06/2023] [Indexed: 07/18/2023]
Abstract
Advancements in AI have resulted in the development of sophisticated language models like ChatGPT, which can generate human-like text. While ChatGPT is useful for clarifying concepts and providing basic guidance, it has limitations. It lacks the ability to provide the latest scientific information and access original medical databases. Studies have shown that ChatGPT's text can be robotic, shallow, and lacking a human touch. It has also been found to provide misleading or inaccurate information. Researchers and medical professionals should be aware of these limitations and not solely rely on ChatGPT for complex tasks. The human element and real-world experiences are indispensable in science, and consulting experts is advisable for reliable insights.
Affiliation(s)
- Hamza Ashraf
- Allama Iqbal Medical College, Lahore, Punjab, Pakistan
- Haider Ashfaq
- Allama Iqbal Medical College, Lahore, Punjab, Pakistan
66
Aguiar de Sousa R, Costa SM, Almeida Figueiredo PH, Camargos CR, Ribeiro BC, Alves E Silva MRM. Is ChatGPT a reliable source of scientific information regarding third-molar surgery? J Am Dent Assoc 2024; 155:227-232.e6. [PMID: 38206257 DOI: 10.1016/j.adaj.2023.11.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2023] [Revised: 11/04/2023] [Accepted: 11/13/2023] [Indexed: 01/12/2024]
Abstract
BACKGROUND ChatGPT (OpenAI) is a large language model. This model uses artificial intelligence and machine learning techniques to generate humanlike language and responses, even to complex questions. The authors aimed to assess the reliability of responses provided via ChatGPT and evaluate its trustworthiness as a means of obtaining information about third-molar surgery. METHODS The authors assessed the 10 most frequently asked questions about mandibular third-molar extraction. A validated questionnaire (Chatbot Usability Questionnaire) was used, and 2 oral and maxillofacial surgeons compared the answers provided with the literature. RESULTS Most of the responses (90.63%) provided via the ChatGPT platform were considered safe and accurate and were consistent with the English-language literature. CONCLUSIONS The ChatGPT platform offers accurate and scientifically backed answers to inquiries about third-molar surgical extraction, making it a dependable and easy-to-use resource for both patients and the general public. However, the platform should provide references with its responses to validate the information. PRACTICAL IMPLICATIONS Patients worldwide are exposed to information sources of varying reliability. Oral surgeons and health care providers should always advise patients to be aware of the information source; the ChatGPT platform offers a reliable option.
67
|
Li J, Dada A, Puladi B, Kleesiek J, Egger J. ChatGPT in healthcare: A taxonomy and systematic review. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 245:108013. [PMID: 38262126 DOI: 10.1016/j.cmpb.2024.108013] [Citation(s) in RCA: 64] [Impact Index Per Article: 64.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/07/2023] [Revised: 12/29/2023] [Accepted: 01/08/2024] [Indexed: 01/25/2024]
Abstract
The recent release of ChatGPT, a chatbot research project/product of natural language processing (NLP) by OpenAI, stirred up a sensation among both the general public and medical professionals, amassing a phenomenally large user base in a short time. This is a typical example of the 'productization' of cutting-edge technologies, which allows the general public without a technical background to gain firsthand experience of artificial intelligence (AI), similar to the AI hype created by AlphaGo (DeepMind Technologies, UK) and self-driving cars (Google, Tesla, etc.). However, it is crucial, especially for healthcare researchers, to remain prudent amidst the hype. This work provides a systematic review of existing publications on the use of ChatGPT in healthcare, elucidating the 'status quo' of ChatGPT in medical applications for general readers, healthcare professionals, and NLP scientists. The large biomedical literature database PubMed is used to retrieve published works on this topic using the keyword 'ChatGPT'. An inclusion criterion and a taxonomy are further proposed to filter the search results and categorize the selected publications, respectively. The review finds that the current release of ChatGPT has achieved only moderate or 'passing' performance in a variety of tests and is unreliable for actual clinical deployment, since it is not intended for clinical applications by design. We conclude that specialized NLP models trained on (bio)medical datasets still represent the right direction to pursue for critical clinical applications.
Affiliation(s)
- Jianning Li
- Institute for Artificial Intelligence in Medicine, University Hospital Essen (AöR), Girardetstraße 2, 45131 Essen, Germany
- Amin Dada
- Institute for Artificial Intelligence in Medicine, University Hospital Essen (AöR), Girardetstraße 2, 45131 Essen, Germany
- Behrus Puladi
- Institute of Medical Informatics, University Hospital RWTH Aachen, Pauwelsstraße 30, 52074 Aachen, Germany; Department of Oral and Maxillofacial Surgery, University Hospital RWTH Aachen, Pauwelsstraße 30, 52074 Aachen, Germany
- Jens Kleesiek
- Institute for Artificial Intelligence in Medicine, University Hospital Essen (AöR), Girardetstraße 2, 45131 Essen, Germany; TU Dortmund University, Department of Physics, Otto-Hahn-Straße 4, 44227 Dortmund, Germany
- Jan Egger
- Institute for Artificial Intelligence in Medicine, University Hospital Essen (AöR), Girardetstraße 2, 45131 Essen, Germany; Center for Virtual and Extended Reality in Medicine (ZvRM), University Hospital Essen, University Medicine Essen, Hufelandstraße 55, 45147 Essen, Germany
68
Mihalache A, Huang RS, Popovic MM, Muni RH. ChatGPT-4: An assessment of an upgraded artificial intelligence chatbot in the United States Medical Licensing Examination. MEDICAL TEACHER 2024; 46:366-372. [PMID: 37839017 DOI: 10.1080/0142159x.2023.2249588] [Citation(s) in RCA: 42] [Impact Index Per Article: 42.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/17/2023]
Abstract
PURPOSE ChatGPT-4 is an upgraded version of an artificial intelligence chatbot. The performance of ChatGPT-4 on the United States Medical Licensing Examination (USMLE) has not been independently characterized. We aimed to assess the performance of ChatGPT-4 at responding to USMLE Step 1, Step 2CK, and Step 3 practice questions. METHOD Practice multiple-choice questions for the USMLE Step 1, Step 2CK, and Step 3 were compiled. Of 376 available questions, 319 (85%) were analyzed by ChatGPT-4 on March 21st, 2023. Our primary outcome was the performance of ChatGPT-4 for the practice USMLE Step 1, Step 2CK, and Step 3 examinations, measured as the proportion of multiple-choice questions answered correctly. Our secondary outcomes were the mean length of questions and responses provided by ChatGPT-4. RESULTS ChatGPT-4 responded to 319 text-based multiple-choice questions from USMLE practice test material. ChatGPT-4 answered 82 of 93 (88%) questions correctly on USMLE Step 1, 91 of 106 (86%) on Step 2CK, and 108 of 120 (90%) on Step 3. ChatGPT-4 provided explanations for all questions. ChatGPT-4 spent 30.8 ± 11.8 s on average responding to practice questions for USMLE Step 1, 23.0 ± 9.4 s per question for Step 2CK, and 23.1 ± 8.3 s per question for Step 3. The mean length of practice USMLE multiple-choice questions that were answered correctly and incorrectly by ChatGPT-4 was similar (difference = 17.48 characters, SE = 59.75, 95%CI = [-100.09,135.04], t = 0.29, p = 0.77). The mean length of ChatGPT-4's correct responses to practice questions was significantly shorter than the mean length of incorrect responses (difference = 79.58 characters, SE = 35.42, 95%CI = [9.89,149.28], t = 2.25, p = 0.03). CONCLUSIONS ChatGPT-4 answered a remarkably high proportion of practice questions correctly for USMLE examinations. ChatGPT-4 performed substantially better at USMLE practice questions than previous models of the same AI chatbot.
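The length comparisons above rest on an unpaired t-test. A minimal sketch of the underlying Welch t-statistic, using hypothetical character counts rather than the study's data:

```python
from math import sqrt

def welch_t(sample_1, sample_2):
    """Welch's t-statistic for two independent samples with unequal variances."""
    n1, n2 = len(sample_1), len(sample_2)
    mean_1 = sum(sample_1) / n1
    mean_2 = sum(sample_2) / n2
    # Unbiased sample variances (n - 1 denominator).
    var_1 = sum((x - mean_1) ** 2 for x in sample_1) / (n1 - 1)
    var_2 = sum((x - mean_2) ** 2 for x in sample_2) / (n2 - 1)
    # Standard error of the difference between the two means.
    se = sqrt(var_1 / n1 + var_2 / n2)
    return (mean_1 - mean_2) / se

# Hypothetical response lengths (characters) for correct vs. incorrect answers;
# a negative t means the first group (correct) is shorter on average.
t = welch_t([420, 515, 388, 460, 402], [530, 610, 498, 575, 544])
```

The p-value then comes from the t distribution with Welch-Satterthwaite degrees of freedom; libraries such as `scipy.stats.ttest_ind(..., equal_var=False)` bundle both steps.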
Affiliation(s)
- Andrew Mihalache
- Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
- Ryan S Huang
- Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
- Marko M Popovic
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada
- Rajeev H Muni
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada
- Department of Ophthalmology, St. Michael's Hospital/Unity Health Toronto, Toronto, Ontario, Canada
69
Abdelhafiz AS, Ali A, Maaly AM, Ziady HH, Sultan EA, Mahgoub MA. Knowledge, Perceptions and Attitude of Researchers Towards Using ChatGPT in Research. J Med Syst 2024; 48:26. [PMID: 38411833 PMCID: PMC10899415 DOI: 10.1007/s10916-024-02044-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2023] [Accepted: 02/10/2024] [Indexed: 02/28/2024]
Abstract
INTRODUCTION ChatGPT, a recently released chatbot from OpenAI, has found applications in various aspects of life, including academic research. This study investigated the knowledge, perceptions, and attitudes of researchers towards using ChatGPT and other chatbots in academic research. METHODS A pre-designed, self-administered survey using Google Forms was employed to conduct the study. The questionnaire assessed participants' knowledge of ChatGPT and other chatbots, their awareness of current chatbot and artificial intelligence (AI) applications, and their attitudes towards ChatGPT and its potential research uses. RESULTS Two hundred researchers participated in the survey. A majority were female (57.5%), and over two-thirds belonged to the medical field (68%). While 67% had heard of ChatGPT, only 11.5% had employed it in their research, primarily for rephrasing paragraphs and finding references. Interestingly, over one-third supported the notion of listing ChatGPT as an author in scientific publications. Concerns emerged regarding AI's potential to automate researcher tasks, particularly in language editing, statistics, and data analysis. Additionally, roughly half expressed ethical concerns about using AI applications in scientific research. CONCLUSION The increasing use of chatbots in academic research necessitates thoughtful regulation that balances potential benefits with inherent limitations and potential risks. Chatbots should not be considered authors of scientific publications but rather assistants to researchers during manuscript preparation and review. Researchers should be equipped with proper training to utilize chatbots and other AI tools effectively and ethically.
Affiliation(s)
- Ahmed Samir Abdelhafiz
- Department of Clinical Pathology, National Cancer Institute, Cairo University, Kasr Al-Aini Street, from Elkhalig Square, Cairo, 11796, Egypt
- Asmaa Ali
- Department of Pulmonary Medicine, Abbassia Chest Hospital, Ministry of Health and Population, Cairo, Egypt
- Ayman Mohamed Maaly
- Department of Anaesthesia and Surgical Intensive Care, Faculty of Medicine, Alexandria University, Alexandria, Egypt
- Hany Hassan Ziady
- Department of Community Medicine, Faculty of Medicine, Alexandria University, Alexandria, Egypt
- Eman Anwar Sultan
- Department of Community Medicine, Faculty of Medicine, Alexandria University, Alexandria, Egypt
- Mohamed Anwar Mahgoub
- Department of Microbiology, High Institute of Public Health, Alexandria University, Alexandria, Egypt
70
Scheschenja M, Viniol S, Bastian MB, Wessendorf J, König AM, Mahnken AH. Feasibility of GPT-3 and GPT-4 for in-Depth Patient Education Prior to Interventional Radiological Procedures: A Comparative Analysis. Cardiovasc Intervent Radiol 2024; 47:245-250. [PMID: 37872295 PMCID: PMC10844465 DOI: 10.1007/s00270-023-03563-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/03/2023] [Accepted: 09/09/2023] [Indexed: 10/25/2023]
Abstract
PURPOSE This study explores the utility of the large language models GPT-3 and GPT-4 for in-depth patient education prior to interventional radiology procedures. Further, differences in answer accuracy between the models were assessed. MATERIALS AND METHODS A total of 133 questions related to three specific interventional radiology procedures (port implantation, PTA, and TACE), covering general information as well as preparation details, risks and complications, and post-procedural aftercare, were compiled. Responses of GPT-3 and GPT-4 were assessed for accuracy by two board-certified radiologists using a 5-point Likert scale. The performance difference between GPT-3 and GPT-4 was analyzed. RESULTS Both GPT-3 and GPT-4 gave (5) "completely correct" or (4) "very good" answers for the majority of questions ((5) 30.8% + (4) 48.1% for GPT-3 and (5) 35.3% + (4) 47.4% for GPT-4). GPT-3 and GPT-4 provided (3) "acceptable" responses 15.8% and 15.0% of the time, respectively. GPT-3 provided (2) "mostly incorrect" responses in 5.3% of instances, while GPT-4 had a lower rate of such occurrences, at just 2.3%. No response was identified as potentially harmful. GPT-4 was found to give significantly more accurate responses than GPT-3 (p = 0.043). CONCLUSION GPT-3 and GPT-4 emerge as relatively safe and accurate tools for patient education in interventional radiology, with GPT-4 showing slightly better performance. The feasibility and accuracy of these models suggest a promising role in revolutionizing patient care. Still, users need to be aware of possible limitations.
Affiliation(s)
- Michael Scheschenja
- Department of Diagnostic and Interventional Radiology, University Hospital Marburg, Philipps-University of Marburg, Baldingerstrasse 1, 35043, Marburg, DE, Germany.
- Simon Viniol
- Department of Diagnostic and Interventional Radiology, University Hospital Marburg, Philipps-University of Marburg, Baldingerstrasse 1, 35043, Marburg, Germany
- Moritz B Bastian
- Department of Diagnostic and Interventional Radiology, University Hospital Marburg, Philipps-University of Marburg, Baldingerstrasse 1, 35043, Marburg, Germany
- Joel Wessendorf
- Department of Diagnostic and Interventional Radiology, University Hospital Marburg, Philipps-University of Marburg, Baldingerstrasse 1, 35043, Marburg, Germany
- Alexander M König
- Department of Diagnostic and Interventional Radiology, University Hospital Marburg, Philipps-University of Marburg, Baldingerstrasse 1, 35043, Marburg, Germany
- Andreas H Mahnken
- Department of Diagnostic and Interventional Radiology, University Hospital Marburg, Philipps-University of Marburg, Baldingerstrasse 1, 35043, Marburg, Germany
71
Kapsali MZ, Livanis E, Tsalikidis C, Oikonomou P, Voultsos P, Tsaroucha A. Ethical Concerns About ChatGPT in Healthcare: A Useful Tool or the Tombstone of Original and Reflective Thinking? Cureus 2024; 16:e54759. [PMID: 38523987 PMCID: PMC10961144 DOI: 10.7759/cureus.54759] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 02/23/2024] [Indexed: 03/26/2024] Open
Abstract
Artificial intelligence (AI), the uprising technology of computer science aiming to create digital systems with human behavior and intelligence, seems to have invaded almost every field of modern life. Launched in November 2022, ChatGPT (Chat Generative Pre-trained Transformer) is a textual AI application capable of creating human-like responses characterized by original language and high coherence. Although AI-based language models have demonstrated impressive capabilities in healthcare, ChatGPT has received controversial annotations from the scientific and academic communities. This chatbot already appears to have a massive impact as an educational tool for healthcare professionals and transformative potential for clinical practice and could lead to dramatic changes in scientific research. Nevertheless, rational concerns were raised regarding whether the pre-trained, AI-generated text would be a menace not only for original thinking and new scientific ideas but also for academic and research integrity, as it gets more and more difficult to distinguish its AI origin due to the coherence and fluency of the produced text. This short review aims to summarize the potential applications and the consequential implications of ChatGPT in the three critical pillars of medicine: education, research, and clinical practice. In addition, this paper discusses whether the current use of this chatbot is in compliance with the ethical principles for the safe use of AI in healthcare, as determined by the World Health Organization. Finally, this review highlights the need for an updated ethical framework and the increased vigilance of healthcare stakeholders to harvest the potential benefits and limit the imminent dangers of this new innovative technology.
Affiliation(s)
- Marina Z Kapsali
- Postgraduate Program on Bioethics, Laboratory of Bioethics, Democritus University of Thrace, Alexandroupolis, GRC
- Efstratios Livanis
- Department of Accounting and Finance, University of Macedonia, Thessaloniki, GRC
- Christos Tsalikidis
- Department of General Surgery, Democritus University of Thrace, Alexandroupolis, GRC
- Panagoula Oikonomou
- Laboratory of Experimental Surgery, Department of General Surgery, Democritus University of Thrace, Alexandroupolis, GRC
- Polychronis Voultsos
- Laboratory of Forensic Medicine & Toxicology (Medical Law and Ethics), School of Medicine, Faculty of Health Sciences, Aristotle University of Thessaloniki, Thessaloniki, GRC
- Aleka Tsaroucha
- Department of General Surgery, Democritus University of Thrace, Alexandroupolis, GRC
72
Metersky K, Chandrasekaran K, Rahman R, Haider M, Al-Hamad A. AI-generated vs. student-crafted assignments and implications for evaluating student work in nursing: an exploratory reflection. Int J Nurs Educ Scholarsh 2024; 21:ijnes-2023-0098. [PMID: 39397290 DOI: 10.1515/ijnes-2023-0098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2023] [Accepted: 09/19/2024] [Indexed: 10/15/2024]
Abstract
OBJECTIVES Chat Generative Pre-Trained Transformer (ChatGPT) is an artificial intelligence-powered language model that can generate unique outputs in response to a user's textual request. This has raised concerns related to academic integrity in nursing education, as students may use the platform to generate original assignment content. The objective of this quality improvement project was to explore and identify effective strategies that educators can use to discern AI-generated papers from student-written submissions. METHODS Four nursing students were asked to submit two versions of a Letter to the Editor assignment: one written by the student, the other generated exclusively by ChatGPT-3.5. RESULTS AI-generated assignments were typically grammatically well written, but some of the scholarly references used were outdated, incorrectly cited, or at times completely fabricated. Additionally, the AI-generated assignments lacked detail and depth. CONCLUSIONS Nursing educators should possess an understanding of the capabilities of ChatGPT-like technologies to further enhance nursing students' knowledge development and to ensure academic integrity is upheld.
Affiliation(s)
- Kateryna Metersky
- Daphne Cockwell School of Nursing, Toronto Metropolitan University, Toronto, ON, Canada
- Rezwana Rahman
- Daphne Cockwell School of Nursing, Toronto Metropolitan University, Toronto, ON, Canada
- Murtaza Haider
- Ted Rogers School of Management, Toronto Metropolitan University, Toronto, ON, Canada
- Areej Al-Hamad
- Daphne Cockwell School of Nursing, Toronto Metropolitan University, Toronto, ON, Canada
73
Kim TW. Application of artificial intelligence chatbots, including ChatGPT, in education, scholarly work, programming, and content generation and its prospects: a narrative review. JOURNAL OF EDUCATIONAL EVALUATION FOR HEALTH PROFESSIONS 2023; 20:38. [PMID: 38148495 DOI: 10.3352/jeehp.2023.20.38] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/26/2023] [Accepted: 12/26/2023] [Indexed: 12/28/2023]
Abstract
This study aims to explore ChatGPT’s (GPT-3.5 version) functionalities, including reinforcement learning, diverse applications, and limitations. ChatGPT is an artificial intelligence (AI) chatbot powered by OpenAI’s Generative Pre-trained Transformer (GPT) model. The chatbot’s applications span education, programming, content generation, and more, demonstrating its versatility. ChatGPT can improve education by creating assignments and offering personalized feedback, as shown by its notable performance in medical exams and the United States Medical Licensing Exam. However, concerns include plagiarism, reliability, and educational disparities. It aids in various research tasks, from design to writing, and has shown proficiency in summarizing and suggesting titles. Its use in scientific writing and language translation is promising, but professional oversight is needed for accuracy and originality. It assists in programming tasks like writing code, debugging, and guiding installation and updates. It offers diverse applications, from cheering up individuals to generating creative content like essays, news articles, and business plans. Unlike search engines, ChatGPT provides interactive, generative responses and understands context, making it more akin to human conversation, in contrast to conventional search engines’ keyword-based, non-interactive nature. ChatGPT has limitations, such as potential bias, dependence on outdated data, and revenue generation challenges. Nonetheless, ChatGPT is considered to be a transformative AI tool poised to redefine the future of generative technology. In conclusion, advancements in AI, such as ChatGPT, are altering how knowledge is acquired and applied, marking a shift from search engines to creativity engines. This transformation highlights the increasing importance of AI literacy and the ability to effectively utilize AI in various domains of life.
Affiliation(s)
- Tae Won Kim
- AI‧Future Strategy Center, National Information Society Agency of Korea, Daegu, Korea
74
Liang J, Wang L, Luo J, Yan Y, Fan C. The relationship between student interaction with generative artificial intelligence and learning achievement: serial mediating roles of self-efficacy and cognitive engagement. Front Psychol 2023; 14:1285392. [PMID: 38187430 PMCID: PMC10766754 DOI: 10.3389/fpsyg.2023.1285392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2023] [Accepted: 11/28/2023] [Indexed: 01/09/2024] Open
Abstract
Generative artificial intelligence (GAI) shocked the world with its unprecedented ability and raised significant tensions in the education field. Educators inevitably transition to an educational future that embraces GAI rather than shuns it. Understanding the mechanism linking students' interaction with GAI tools to their achievement is important for educators and schools, but relevant empirical evidence is relatively lacking. Given the personalization and real-time interactivity of GAI tools, we propose that student-GAI interaction affects learning achievement through the serial mediators of self-efficacy and cognitive engagement. Based on a questionnaire survey of 389 participants, this study finds that: (1) overall, there is a significantly positive relationship between student-GAI interaction and learning achievement. (2) This positive relationship is mediated by self-efficacy, with a significant mediation effect of 0.015. (3) Cognitive engagement also acts as a mediator, with a significant and relatively strong mediating effect of 0.046. (4) Self-efficacy and cognitive engagement in series mediate this positive association, with a serial mediating effect of 0.011, which is comparatively small but still significant. In addition, the propensity score matching (PSM) method is applied to alleviate self-selection bias, reinforcing the validity of the results. The findings offer empirical evidence for the incorporation of GAI in teaching and learning.
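The mediation effects above are products of regression path coefficients. A minimal single-mediator sketch of that product-of-coefficients logic on synthetic data (the paper's serial model chains two mediators, and its coefficients come from the actual survey):

```python
def ols_slope(xs, ys):
    """Slope of the simple OLS regression of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    return s_xy / s_xx

def partial_slope(ys, ms, xs):
    """Coefficient on ms in the two-predictor OLS regression ys ~ ms + xs."""
    n = len(ys)
    mean_m, mean_x, mean_y = sum(ms) / n, sum(xs) / n, sum(ys) / n
    s_mm = sum((m - mean_m) ** 2 for m in ms)
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_mx = sum((m - mean_m) * (x - mean_x) for m, x in zip(ms, xs))
    s_my = sum((m - mean_m) * (y - mean_y) for m, y in zip(ms, ys))
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Solve the 2x2 normal equations for the coefficient on ms.
    return (s_my * s_xx - s_xy * s_mx) / (s_mm * s_xx - s_mx ** 2)

# Synthetic data: interaction -> self-efficacy -> achievement.
interaction = [0, 1, 2, 3, 4, 5]
self_efficacy = [0.1, 1.9, 4.1, 5.9, 8.1, 9.9]             # roughly 2 * interaction
achievement = [3 * m + 0.5 * x for m, x in zip(self_efficacy, interaction)]

a = ols_slope(interaction, self_efficacy)                   # path X -> M
b = partial_slope(achievement, self_efficacy, interaction)  # path M -> Y given X
indirect_effect = a * b   # mediated (indirect) effect of interaction on achievement
```

A serial model extends this by multiplying along the chain (a × d × b for X → M1 → M2 → Y), which is how serial effects like the reported 0.011 are typically estimated.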
Affiliation(s)
- Jing Liang
- College of Management Science, Chengdu University of Technology, Chengdu, China
- Lili Wang
- School of Logistics, Chengdu University of Information Technology, Chengdu, China
- Jia Luo
- Business School, Chengdu University, Chengdu, China
- Yufei Yan
- Business School, Southwest Minzu University, Chengdu, China
- Chao Fan
- College of Management Science, Chengdu University of Technology, Chengdu, China

75
López-Úbeda P, Martín-Noguerol T, Luna A. Reply to the Letter to the Editor: "Radiology in the era of large language models: additional facts to consider in the near and the dark side of the moon". Eur Radiol 2023; 33:9460-9461. [PMID: 37924343 DOI: 10.1007/s00330-023-10331-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/12/2023] [Revised: 09/15/2023] [Accepted: 09/19/2023] [Indexed: 11/06/2023]
Affiliation(s)
- Antonio Luna
- MRI Unit, Radiology Department, HT Medica, Jaén, Spain

76
Parker JL, Becker K, Carroca C. ChatGPT for Automated Writing Evaluation in Scholarly Writing Instruction. J Nurs Educ 2023; 62:721-727. [PMID: 38049299 DOI: 10.3928/01484834-20231006-02] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/06/2023]
Abstract
BACKGROUND Effective strategies for developing scholarly writing skills in postsecondary nursing students are needed. Generative artificial intelligence (GAI) tools, such as ChatGPT, for automated writing evaluation (AWE) hold promise for mitigating challenges associated with scholarly writing instruction in nursing education. This article explores the suitability of ChatGPT for AWE in writing instruction. METHOD ChatGPT feedback on 42 nursing student texts from the Michigan Corpus of Upper-Level Student Papers was assessed. Assessment criteria were derived from recent AWE research. RESULTS ChatGPT demonstrated utility as an AWE tool. Its scoring was stricter than that of human raters, its feedback addressed macro-level writing features, and it supported multiple submissions and learner autonomy. CONCLUSION Despite concerns surrounding GAI in academia, educators can accelerate the feedback process without increasing their workload, and students can receive individualized feedback, by incorporating AWE provided by ChatGPT into the writing process. [J Nurs Educ. 2023;62(12):721-727.].

77
Hamid H, Zulkifli K, Naimat F, Che Yaacob NL, Ng KW. Exploratory study on student perception on the use of chat AI in process-driven problem-based learning. CURRENTS IN PHARMACY TEACHING & LEARNING 2023; 15:1017-1025. [PMID: 37923639 DOI: 10.1016/j.cptl.2023.10.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/08/2023] [Revised: 08/08/2023] [Accepted: 10/17/2023] [Indexed: 11/07/2023]
Abstract
INTRODUCTION With the increasing prevalence of artificial intelligence (AI) technology, it is imperative to investigate its influence on education and the resulting impact on student learning outcomes. This includes exploring the potential application of AI in process-driven problem-based learning (PDPBL). This study aimed to investigate students' perceptions of the use of ChatGPT (built on GPT-3.5) in PDPBL in the Bachelor of Pharmacy program. METHODS Eighteen students with prior experience in traditional PDPBL processes participated in the study, divided into three groups to perform PDPBL sessions with various triggers from the pharmaceutical chemistry, pharmaceutics, and clinical pharmacy fields, while utilizing the chat AI provided by ChatGPT to assist with data searching and problem-solving. Questionnaires were used to collect data on the impact of ChatGPT on students' satisfaction, engagement, participation, and learning experience during the PBL sessions. RESULTS The survey revealed that ChatGPT improved group collaboration and engagement during PDPBL, while increasing motivation and encouraging more questions. Nevertheless, some students encountered difficulties understanding ChatGPT's information and questioned its reliability and credibility. Despite these challenges, most students saw ChatGPT's potential to eventually replace traditional information-seeking methods. CONCLUSIONS The study suggests that ChatGPT has the potential to enhance PDPBL in pharmacy education. However, further research with a larger sample is needed to examine the validity and reliability of the information provided by ChatGPT and its impact on learning.
Affiliation(s)
- Hazrina Hamid
- Faculty of Pharmacy, Lincoln University College, 12-18, Jalan SS 6/12, 47301 Petaling Jaya, Selangor Darul Ehsan, Malaysia
- Khadjizah Zulkifli
- Faculty of Pharmacy, Lincoln University College, 12-18, Jalan SS 6/12, 47301 Petaling Jaya, Selangor Darul Ehsan, Malaysia
- Faiza Naimat
- Department of Pharmacy, Malaysia National Heart Institute College, 145, Jalan Tun Razak, 50400 Kuala Lumpur, Malaysia
- Nor Liana Che Yaacob
- Faculty of Pharmacy, University Sultan Zainal Abidin, 20400 Kuala Terengganu, Terengganu Darul Iman, Malaysia
- Kwok Wen Ng
- Faculty of Pharmacy, Quest International University, 227, Jalan Raja Permaisuri Bainun, 30250 Ipoh, Perak, Malaysia

78
Liu H, Azam M, Bin Naeem S, Faiola A. An overview of the capabilities of ChatGPT for medical writing and its implications for academic integrity. Health Info Libr J 2023; 40:440-446. [PMID: 37806782 DOI: 10.1111/hir.12509] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/21/2023] [Accepted: 09/25/2023] [Indexed: 10/10/2023]
Abstract
The artificial intelligence (AI) tool ChatGPT, which is based on a large language model (LLM), is gaining popularity in academic institutions, notably in the medical field. This article provides a brief overview of the capabilities of ChatGPT for medical writing and its implications for academic integrity. It lists generative AI tools, describes their common uses in medical writing, and lists tools for detecting AI-generated text. It also offers recommendations for policymakers, information professionals, and medical faculty on the constructive use of generative AI tools and related technology, and highlights the role of health sciences librarians and educators in deterring students from generating text through ChatGPT in their academic work.
Affiliation(s)
- Huihui Liu
- Shanxi University, Xiaodian District, Taiyuan, People's Republic of China
- Mehreen Azam
- Department of Information Management, The Islamia University of Bahawalpur, Bahawalpur, Pakistan
- Salman Bin Naeem
- Department of Information Management, The Islamia University of Bahawalpur, Bahawalpur, Pakistan
- Anthony Faiola
- Department of Health and Clinical Sciences, College of Health Sciences, University of Kentucky, Lexington, Kentucky, USA

79
Cox RL, Hunt KL, Hill RR. Comparative Analysis of NCLEX-RN Questions: A Duel Between ChatGPT and Human Expertise. J Nurs Educ 2023; 62:679-687. [PMID: 38049305 DOI: 10.3928/01484834-20231006-07] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/06/2023]
Abstract
BACKGROUND Artificial intelligence (AI) has the potential to revolutionize nursing education. This study compared NCLEX-RN questions generated by AI with those created by nurse educators. METHOD Faculty of accredited baccalaureate programs were invited to participate. Likert-scale items for grammar and clarity of the item stem and distractors were compared using the Mann-Whitney U test, and yes/no questions about clinical relevance and complex terminology were analyzed using chi-square. A one-sample binomial test with confidence intervals evaluated participants' question preference (AI-generated or educator-written). Qualitative responses identified themes across faculty. RESULTS Item clarity, grammar, and difficulty were similar for AI- and educator-created questions. Clinical relevance and use of complex terminology were similar for all question pairs. Of the four sets with a preference for one item, three were generated by AI. CONCLUSION AI can assist faculty with item generation to prepare nursing students for the NCLEX-RN examination. Faculty expertise is necessary to refine questions written using both methods. [J Nurs Educ. 2023;62(12):679-687.].

80
Silva TP, Ocampo TSC, Alencar-Palha C, Oliveira-Santos C, Takeshita WM, Oliveira ML. ChatGPT: a tool for scientific writing or a threat to integrity? Br J Radiol 2023; 96:20230430. [PMID: 37750843 PMCID: PMC10646664 DOI: 10.1259/bjr.20230430] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2023] [Revised: 08/03/2023] [Accepted: 08/06/2023] [Indexed: 09/27/2023] Open
Abstract
The use of ChatGPT as a tool for writing and knowledge integration raises concerns that it may come to replace critical thinking and academic writing skills. While ChatGPT can assist in generating text and suggesting appropriate language, it should not replace the human responsibility for creating innovative knowledge through experiential learning. The accuracy and quality of information provided by ChatGPT also require caution, as previous studies have reported inaccuracies in the references used by chatbots. ChatGPT acknowledges certain limitations, including the potential for generating erroneous or biased content; it is therefore essential to interpret its responses with caution and to recognize the indispensable role of human experience in information retrieval and knowledge creation. Furthermore, the challenge of distinguishing between papers written by humans and those written by AI highlights the need for thorough review processes to prevent the spread of articles that could erode confidence in the accuracy and integrity of scientific research. Overall, while ChatGPT can be helpful, it is crucial to raise awareness of the potential issues associated with its use and to discuss boundaries so that AI can be used without compromising the quality of scientific articles and the integrity of evidence-based knowledge.
Affiliation(s)
- Thaísa Pinheiro Silva
- Department of Oral Diagnosis, Division of Oral Radiology, Piracicaba Dental School, University of Campinas, Piracicaba, Brazil
- Thaís S C Ocampo
- Department of Oral Diagnosis, Division of Oral Radiology, Piracicaba Dental School, University of Campinas, Piracicaba, Brazil
- Caio Alencar-Palha
- Department of Oral Diagnosis, Division of Oral Radiology, Piracicaba Dental School, University of Campinas, Piracicaba, Brazil
- Christiano Oliveira-Santos
- Department of Diagnosis and Oral Health, University of Louisville School of Dentistry, Louisville, United States
- Wilton Mitsunari Takeshita
- Department of Diagnosis and Surgery, Paulista State University Júlio de Mesquita Filho, Araçatuba, Brazil
- Matheus L Oliveira
- Department of Oral Diagnosis, Division of Oral Radiology, Piracicaba Dental School, University of Campinas, Piracicaba, Brazil

81
Bhattacharya K, Bhattacharya AS, Bhattacharya N, Yagnik VD, Garg P, Kumar S. ChatGPT in Surgical Practice—a New Kid on the Block. Indian J Surg 2023; 85:1346-1349. [DOI: 10.1007/s12262-023-03727-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2023] [Accepted: 02/16/2023] [Indexed: 02/24/2023] Open

82
Wattanapisit A, Photia A, Wattanapisit S. Should ChatGPT be considered a medical writer? MALAYSIAN FAMILY PHYSICIAN: THE OFFICIAL JOURNAL OF THE ACADEMY OF FAMILY PHYSICIANS OF MALAYSIA 2023; 18:69. [PMID: 38111835 PMCID: PMC10726750 DOI: 10.51866/lte.483] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Key Words] [Subscribe] [Scholar Register] [Indexed: 12/20/2023]
Affiliation(s)
- Apichai Wattanapisit
- MD, Dip., Thai Board of Family Medicine, Academic Fellowship Department of Clinical Medicine, School of Medicine, Walailak University, Thasala, Nakhon Si Thammarat, Thailand
- Family Medicine Clinic, Walailak University Hospital, Nakhon Si Thammarat, Thailand
- Apichat Photia
- MD, Dip., Thai Board of Pediatrics, Pediatric Hematology and Oncology, Research Fellowship Phramongkutklao Hospital and College of Medicine, Bangkok, Thailand
- Sanhapan Wattanapisit
- MD, Dip., Thai Board of Family Medicine, MSc Family Medicine Unit, Thasala Hospital, Nakhon Si Thammarat, Thailand

83
Tian S, Jin Q, Yeganova L, Lai PT, Zhu Q, Chen X, Yang Y, Chen Q, Kim W, Comeau DC, Islamaj R, Kapoor A, Gao X, Lu Z. Opportunities and challenges for ChatGPT and large language models in biomedicine and health. Brief Bioinform 2023; 25:bbad493. [PMID: 38168838 PMCID: PMC10762511 DOI: 10.1093/bib/bbad493] [Citation(s) in RCA: 83] [Impact Index Per Article: 41.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2023] [Revised: 11/15/2023] [Accepted: 12/06/2023] [Indexed: 01/05/2024] Open
Abstract
ChatGPT has drawn considerable attention from both the general public and domain experts with its remarkable text generation capabilities. This has subsequently led to the emergence of diverse applications in the field of biomedicine and health. In this work, we examine the diverse applications of large language models (LLMs), such as ChatGPT, in biomedicine and health. Specifically, we explore the areas of biomedical information retrieval, question answering, medical text summarization, information extraction, and medical education, and investigate whether LLMs possess the transformative power to revolutionize these tasks or whether the distinct complexities of the biomedical domain present unique challenges. Following an extensive literature survey, we find that significant advances have been made in text generation tasks, surpassing the previous state-of-the-art methods. For other applications, the advances have been modest. Overall, LLMs have not yet revolutionized biomedicine, but recent rapid progress indicates that such methods hold great potential to provide valuable means for accelerating discovery and improving health. We also find that the use of LLMs, like ChatGPT, in biomedicine and health entails various risks and challenges, including fabricated information in generated responses, as well as legal and privacy concerns associated with sensitive patient data. We believe this survey can provide a comprehensive and timely overview for biomedical researchers and healthcare practitioners on the opportunities and challenges associated with using ChatGPT and other LLMs to transform biomedicine and health.
Affiliation(s)
- Shubo Tian
- National Library of Medicine, National Institutes of Health
- Qiao Jin
- National Library of Medicine, National Institutes of Health
- Lana Yeganova
- National Library of Medicine, National Institutes of Health
- Po-Ting Lai
- National Library of Medicine, National Institutes of Health
- Qingqing Zhu
- National Library of Medicine, National Institutes of Health
- Xiuying Chen
- King Abdullah University of Science and Technology
- Yifan Yang
- National Library of Medicine, National Institutes of Health
- Qingyu Chen
- National Library of Medicine, National Institutes of Health
- Won Kim
- National Library of Medicine, National Institutes of Health
- Aadit Kapoor
- National Library of Medicine, National Institutes of Health
- Xin Gao
- King Abdullah University of Science and Technology
- Zhiyong Lu
- National Library of Medicine, National Institutes of Health

84
Humar P, Asaad M, Bengur FB, Nguyen V. ChatGPT Is Equivalent to First-Year Plastic Surgery Residents: Evaluation of ChatGPT on the Plastic Surgery In-Service Examination. Aesthet Surg J 2023; 43:NP1085-NP1089. [PMID: 37140001 DOI: 10.1093/asj/sjad130] [Citation(s) in RCA: 68] [Impact Index Per Article: 34.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/20/2023] [Revised: 04/27/2023] [Accepted: 04/28/2023] [Indexed: 05/05/2023] Open
Abstract
BACKGROUND ChatGPT is an artificial intelligence language model developed and released by OpenAI (San Francisco, CA) in late 2022. OBJECTIVES The aim of this study was to evaluate the performance of ChatGPT on the Plastic Surgery In-Service Examination and to compare it to residents' performance nationally. METHODS The Plastic Surgery In-Service Examinations from 2018 to 2022 were used as a question source. For each question, the stem and all multiple-choice options were imported into ChatGPT. The 2022 examination was used to compare the performance of ChatGPT to that of plastic surgery residents nationally. RESULTS In total, 1129 questions were included in the final analysis, and ChatGPT answered 630 (55.8%) of these correctly. ChatGPT scored highest on the 2021 exam (60.1%) and on the comprehensive section (58.7%). There were no significant differences in questions answered correctly among exam years or among the different exam sections. ChatGPT answered 57% of questions correctly on the 2022 exam. When compared with the performance of plastic surgery residents in 2022, ChatGPT would rank in the 49th percentile for first-year integrated plastic surgery residents, the 13th percentile for second-year residents, the 5th percentile for third- and fourth-year residents, and the 0th percentile for fifth- and sixth-year residents. CONCLUSIONS ChatGPT performs at the level of a first-year resident on the Plastic Surgery In-Service Examination but performed poorly compared with residents in more advanced years of training. Although ChatGPT has many undeniable benefits and potential uses in healthcare and medical education, additional research is required to assess its efficacy.

85
Vitorino LM, Júnior GHY. ChatGPT and the teaching of contemporary nursing: And now professor? J Clin Nurs 2023; 32:7921-7922. [PMID: 37004198 DOI: 10.1111/jocn.16706] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2023] [Accepted: 03/20/2023] [Indexed: 04/03/2023]

86
Makiev KG, Asimakidou M, Vasios IS, Keskinis A, Petkidis G, Tilkeridis K, Ververidis A, Iliopoulos E. A Study on Distinguishing ChatGPT-Generated and Human-Written Orthopaedic Abstracts by Reviewers: Decoding the Discrepancies. Cureus 2023; 15:e49166. [PMID: 38130535 PMCID: PMC10733892 DOI: 10.7759/cureus.49166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 11/21/2023] [Indexed: 12/23/2023] Open
Abstract
BACKGROUND ChatGPT (OpenAI Incorporated, Mission District, San Francisco, United States) is an artificial intelligence (AI)-based language model that generates human-resembling texts. Such AI-generated writing is comprehensible and contextually relevant, and it is genuinely difficult to differentiate from human-written content. ChatGPT has risen in popularity lately and is widely utilized in scholarly manuscript drafting. The aim of this study was to identify whether 1) human reviewers can differentiate between AI-generated and human-written abstracts and 2) AI detectors are currently reliable in detecting AI-generated abstracts. METHODS Seven blinded reviewers were asked to read 21 abstracts and determine which were AI-generated and which were human-written. The first group consisted of three orthopaedic residents with limited research experience (OR). The second group included three orthopaedic professors with extensive research experience (OP). The seventh reviewer was a non-orthopaedic doctor and acted as a control in terms of expertise. All abstracts were scanned by a plagiarism detector program. The performance of two different AI detectors in identifying AI-generated abstracts was also analyzed. A structured interview was conducted at the end of the survey to evaluate the decision-making process used by each reviewer. RESULTS The OR group correctly identified the authorship of 34.9% of the abstracts and the OP group 31.7%. The non-orthopaedic control correctly identified 76.2%. All AI-generated abstracts were 100% unique (0% plagiarism). The first AI detector correctly identified the authors of only 9/21 (42.9%) abstracts, whereas the second AI detector identified 14/21 (66.6%). CONCLUSION The inability to correctly identify AI-generated content poses a significant scientific risk, as "false" abstracts can end up in scientific conferences or publications. Neither expertise nor research background was shown to have any meaningful impact on the predictive outcome. A focus on how statistical data are presented may help the differentiation process. Further research is warranted to highlight which elements could help reveal an AI-generated abstract.
Affiliation(s)
- Konstantinos G Makiev
- Department of Orthopaedics, University General Hospital of Alexandroupolis, Democritus University of Thrace, Alexandroupoli, GRC
- Maria Asimakidou
- School of Medicine, University General Hospital of Alexandroupolis, Democritus University of Thrace, Alexandroupoli, GRC
- Ioannis S Vasios
- Department of Orthopaedics, University General Hospital of Alexandroupolis, Democritus University of Thrace, Alexandroupoli, GRC
- Anthimos Keskinis
- Department of Orthopaedics, University General Hospital of Alexandroupolis, Democritus University of Thrace, Alexandroupoli, GRC
- Georgios Petkidis
- Department of Orthopaedics, University General Hospital of Alexandroupolis, Democritus University of Thrace, Alexandroupoli, GRC
- Konstantinos Tilkeridis
- Department of Orthopaedics, University General Hospital of Alexandroupolis, Democritus University of Thrace, Alexandroupoli, GRC
- Athanasios Ververidis
- Department of Orthopaedics, University General Hospital of Alexandroupolis, Democritus University of Thrace, Alexandroupoli, GRC
- Efthymios Iliopoulos
- Department of Orthopaedics, University General Hospital of Alexandroupolis, Democritus University of Thrace, Alexandroupoli, GRC

87
Shankar SA. AI-generated potential research paper: overview in cardiac surgery-Is this the future? Indian J Thorac Cardiovasc Surg 2023; 39:651-653. [PMID: 37885927 PMCID: PMC10597974 DOI: 10.1007/s12055-023-01579-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/05/2023] [Revised: 07/14/2023] [Accepted: 07/17/2023] [Indexed: 10/28/2023] Open
Affiliation(s)
- S Anand Shankar
- Institute of Cardiovascular and Thoracic Surgery, RGGGH, Chennai, 600003 India

88
Peh W, Saw A. Artificial Intelligence: Impact and Challenges to Authors, Journals and Medical Publishing. Malays Orthop J 2023; 17:1-4. [PMID: 38107365 PMCID: PMC10723007 DOI: 10.5704/moj.2311.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/17/2023] [Accepted: 09/20/2023] [Indexed: 12/19/2023] Open
Abstract
Artificial intelligence (AI)-assisted technologies are here to stay and cannot be ignored. These tools can generate highly realistic, human-like text and perform a broad range of useful language tasks with many applications. They have the potential to expedite innovation in health care and can help promote equity and diversity in research by overcoming language barriers. When using these AI tools, authors must take responsibility for the output and originality of their work, as publishers expect all content to be generated by human authors unless there is a declaration to the contrary. Authors must disclose how AI tools have been used and ensure appropriate attribution of all text, images, and audio-visual material. The responsible use of AI language models and transparent reporting of how these tools were used in the creation of information and publications are vital to promote and protect the credibility and integrity of medical research, and trust in medical knowledge. Educating postgraduate and undergraduate students, researchers, and authors on the applications and best use of AI-assisted technologies, together with the importance of critical thinking, integrity, and strict adherence to ethical principles, are key steps that need to be undertaken.
Affiliation(s)
- Wcg Peh
- Department of Diagnostic Radiology, Khoo Teck Puat Hospital, Singapore
- A Saw
- Department of Orthopaedic Surgery (NOCERAL), University of Malaya, Kuala Lumpur, Malaysia

89
Huang H. Performance of ChatGPT on Registered Nurse License Exam in Taiwan: A Descriptive Study. Healthcare (Basel) 2023; 11:2855. [PMID: 37958000 PMCID: PMC10649156 DOI: 10.3390/healthcare11212855] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2023] [Revised: 10/17/2023] [Accepted: 10/27/2023] [Indexed: 11/15/2023] Open
Abstract
(1) Background: AI (artificial intelligence) chatbots have been widely applied. ChatGPT could enhance individual learning capabilities and clinical reasoning skills and facilitate students' understanding of complex concepts in healthcare education. There is currently less emphasis on its application in nursing education, and its use there needs to be verified. (2) Methods: A descriptive study was used to analyze the scores of ChatGPT on the registered nurse license exam (RNLE) in 2022~2023 and to explore the responses and explanations given by ChatGPT. Data measurement encompassed input sourcing, encoding methods, and statistical analysis. (3) Results: ChatGPT responded promptly within seconds. Its average scores on the four exams ranged from around 51.6 to 63.75, and it passed the first 2022 and second 2023 sittings of the RNLE. However, ChatGPT may generate misleading or inaccurate explanations, or it could lead to hallucination; confusion or misunderstanding in complicated scenarios; and language bias. (4) Conclusions: ChatGPT may have the potential to assist with nursing education because of its advantages. It is recommended to integrate ChatGPT into different nursing courses and to assess its limitations and effectiveness through a variety of tools and methods.
Affiliation(s)
- Huiman Huang
- School of Nursing, College of Nursing, Tzu Chi University of Science and Technology, Hualien 970302, Taiwan

90
Draschl A, Hauer G, Fischerauer SF, Kogler A, Leitner L, Andreou D, Leithner A, Sadoghi P. Are ChatGPT's Free-Text Responses on Periprosthetic Joint Infections of the Hip and Knee Reliable and Useful? J Clin Med 2023; 12:6655. [PMID: 37892793 PMCID: PMC10607052 DOI: 10.3390/jcm12206655] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2023] [Revised: 10/12/2023] [Accepted: 10/17/2023] [Indexed: 10/29/2023] Open
Abstract
BACKGROUND This study aimed to evaluate ChatGPT's performance on questions about periprosthetic joint infections (PJI) of the hip and knee. METHODS Twenty-seven questions from the 2018 International Consensus Meeting on Musculoskeletal Infection were selected for response generation. The free-text responses were evaluated by three orthopedic surgeons using a five-point Likert scale. Inter-rater reliability (IRR) was assessed via Fleiss' kappa (FK). RESULTS Overall, near-perfect IRR was found for disagreement on the presence of factual errors (FK: 0.880, 95% CI [0.724, 1.035], p < 0.001) and agreement on information completeness (FK: 0.848, 95% CI [0.699, 0.996], p < 0.001). Substantial IRR was observed for disagreement on misleading information (FK: 0.743, 95% CI [0.601, 0.886], p < 0.001) and agreement on suitability for patients (FK: 0.627, 95% CI [0.478, 0.776], p < 0.001). Moderate IRR was observed for agreement on "up-to-dateness" (FK: 0.584, 95% CI [0.434, 0.734], p < 0.001) and suitability for orthopedic surgeons (FK: 0.505, 95% CI [0.383, 0.628], p < 0.001). Question- and subtopic-specific analysis revealed diverse IRR levels ranging from near-perfect to poor. CONCLUSIONS ChatGPT's free-text responses to complex orthopedic questions were predominantly reliable and useful for orthopedic surgeons and patients. Given variations in performance by question and subtopic, consulting additional sources and exercising careful interpretation should be emphasized for reliable medical decision-making.
Affiliation(s)
- Alexander Draschl
- Department of Orthopedics and Trauma, Medical University of Graz, Auenbruggerplatz 5, 8036 Graz, Austria
- Division of Plastic, Aesthetic and Reconstructive Surgery, Department of Surgery, Medical University of Graz, Auenbruggerplatz 29/4, 8036 Graz, Austria
- Georg Hauer
- Department of Orthopedics and Trauma, Medical University of Graz, Auenbruggerplatz 5, 8036 Graz, Austria
- Stefan Franz Fischerauer
- Department of Orthopedics and Trauma, Medical University of Graz, Auenbruggerplatz 5, 8036 Graz, Austria
- Angelika Kogler
- Department of Orthopedics and Trauma, Medical University of Graz, Auenbruggerplatz 5, 8036 Graz, Austria
- Department of Dermatology and Venereology, Medical University of Graz, Auenbruggerplatz 8, 8036 Graz, Austria
- Lukas Leitner
- Department of Orthopedics and Trauma, Medical University of Graz, Auenbruggerplatz 5, 8036 Graz, Austria
- Dimosthenis Andreou
- Department of Orthopedics and Trauma, Medical University of Graz, Auenbruggerplatz 5, 8036 Graz, Austria
- Andreas Leithner
- Department of Orthopedics and Trauma, Medical University of Graz, Auenbruggerplatz 5, 8036 Graz, Austria
- Patrick Sadoghi
- Department of Orthopedics and Trauma, Medical University of Graz, Auenbruggerplatz 5, 8036 Graz, Austria

91
Moons P, Van Bulck L. ChatGPT: can artificial intelligence language models be of value for cardiovascular nurses and allied health professionals. Eur J Cardiovasc Nurs 2023; 22:e55-e59. [PMID: 36752788 DOI: 10.1093/eurjcn/zvad022] [Citation(s) in RCA: 50] [Impact Index Per Article: 25.0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/24/2023] [Revised: 02/01/2022] [Accepted: 02/02/2023] [Indexed: 02/09/2023]
Affiliation(s)
- Philip Moons
- KU Leuven Department of Public Health and Primary Care, KU Leuven-University of Leuven, Kapucijnenvoer 35, Box 7001, 3000 Leuven, Belgium
- Institute of Health and Care Sciences, University of Gothenburg, Arvid Wallgrens backe 1, 413 46 Gothenburg, Sweden
- Department of Paediatrics and Child Health, University of Cape Town, Klipfontein Rd, Rondebosch, 7700 Cape Town, South Africa
| | - Liesbet Van Bulck
- KU Leuven Department of Public Health and Primary Care, KU Leuven-University of Leuven, Kapucijnenvoer 35, Box 7001, 3000 Leuven, Belgium
- Research Foundation Flanders (FWO), Leuvenseweg 38, 1000 Brussels, Belgium
| |
Collapse
|
92
|
Bassiri-Tehrani B, Cress PE. Unleashing the Power of ChatGPT: Revolutionizing Plastic Surgery and Beyond. Aesthet Surg J 2023; 43:1395-1399. [PMID: 37154803 DOI: 10.1093/asj/sjad135] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2023] [Revised: 05/01/2023] [Accepted: 05/02/2023] [Indexed: 05/10/2023] Open
|
93
|
Kellner AWA. Artificial Intelligence in scientific publications? AN ACAD BRAS CIENC 2023; 95:e202395S1. [PMID: 37851718 DOI: 10.1590/0001-37652023202395s1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2023] Open
Affiliation(s)
- Alexander W A Kellner
- Universidade Federal do Rio de Janeiro, Museu Nacional, Laboratório de Sistemática e Tafonomia de Vertebrados Fósseis, Departamento de Geologia e Paleontologia, Quinta da Boa Vista, s/n, São Cristóvão, 20940-040 Rio de Janeiro, RJ, Brazil
| |
Collapse
|
94
|
Huh S. Ethical consideration of the use of generative artificial intelligence, including ChatGPT in writing a nursing article. CHILD HEALTH NURSING RESEARCH 2023; 29:249-251. [PMID: 37939670 PMCID: PMC10636529 DOI: 10.4094/chnr.2023.29.4.249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2023] [Accepted: 10/24/2023] [Indexed: 11/10/2023] Open
Affiliation(s)
- Sun Huh
- Professor, Department of Parasitology and Institute of Medical Education, College of Medicine, Hallym University, Chuncheon, Korea
| |
Collapse
|
95
|
Hosseini M, Resnik DB, Holmes K. The ethics of disclosing the use of artificial intelligence tools in writing scholarly manuscripts. RESEARCH ETHICS 2023; 19:449-465. [PMID: 39749232 PMCID: PMC11694804 DOI: 10.1177/17470161231180449] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 01/04/2025]
Abstract
In this article, we discuss ethical issues related to using and disclosing artificial intelligence (AI) tools, such as ChatGPT and other systems based on large language models (LLMs), to write or edit scholarly manuscripts. Some journals, such as Science, have banned the use of LLMs because of the ethical problems they raise concerning responsible authorship. We argue that this is not a reasonable response to the moral conundrums created by the use of LLMs because bans are unenforceable and would encourage undisclosed use of LLMs. Furthermore, LLMs can be useful in writing, reviewing, and editing text, and can promote equity in science. Others have argued that LLMs should be mentioned in the acknowledgments since they do not meet all the authorship criteria. We argue that naming LLMs as authors or mentioning them in the acknowledgments are both inappropriate forms of recognition because LLMs do not have free will and therefore cannot be held morally or legally responsible for what they do. Tools in general, and software in particular, are usually cited in-text and then listed in the references. We provide suggestions to improve APA Style for referencing ChatGPT to specifically indicate the contributor who used LLMs (because interactions are stored on personal user accounts), the version and model used (because the same version could use different language models and generate dissimilar responses, e.g., ChatGPT May 12 Version GPT3.5 or GPT4), and the time of usage (because LLMs evolve fast and generate dissimilar responses over time). We recommend that researchers who use LLMs: (1) disclose their use in the introduction or methods section to transparently describe details such as the prompts used and note which parts of the text are affected, (2) use in-text citations and references (to recognize the applications used and improve findability and indexing), and (3) record and submit their relevant interactions with LLMs as supplementary material or appendices.
Collapse
Affiliation(s)
| | - David B Resnik
- National Institute of Environmental Health Sciences, USA
| | - Kristi Holmes
- Northwestern University Feinberg School of Medicine, USA
| |
Collapse
|
96
|
Jeyaraman M, Ramasubramanian S, Balaji S, Jeyaraman N, Nallakumarasamy A, Sharma S. ChatGPT in action: Harnessing artificial intelligence potential and addressing ethical challenges in medicine, education, and scientific research. World J Methodol 2023; 13:170-178. [PMID: 37771867 PMCID: PMC10523250 DOI: 10.5662/wjm.v13.i4.170] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/02/2023] [Revised: 06/29/2023] [Accepted: 07/24/2023] [Indexed: 09/20/2023] Open
Abstract
Artificial intelligence (AI) tools, like OpenAI's Chat Generative Pre-trained Transformer (ChatGPT), hold considerable potential in healthcare, academia, and diverse industries. Evidence demonstrates its capability at a medical student level in standardized tests, suggesting utility in medical education, radiology reporting, genetics research, data optimization, and drafting repetitive texts such as discharge summaries. Nevertheless, these tools should augment, not supplant, human expertise. Despite promising applications, ChatGPT faces limitations, including weaknesses in critical thinking tasks and a tendency to generate false references, necessitating stringent cross-verification. Attendant concerns, such as potential misuse, bias, blind trust, and privacy, underscore the need for transparency, accountability, and clear policies. Evaluation of AI-generated content and preservation of academic integrity are critical. With responsible use, AI can significantly improve healthcare, academia, and industry without compromising integrity and research quality. For effective and ethical AI deployment, collaboration amongst AI developers, researchers, educators, and policymakers is vital. The development of domain-specific tools, guidelines, and regulations, and the facilitation of public dialogue, must underpin these endeavors to responsibly harness AI's potential.
Collapse
Affiliation(s)
- Madhan Jeyaraman
- Department of Orthopaedics, ACS Medical College and Hospital, Dr MGR Educational and Research Institute, Chennai 600077, Tamil Nadu, India
| | - Swaminathan Ramasubramanian
- Department of General Medicine, Government Medical College, Omandurar Government Estate, Chennai 600018, Tamil Nadu, India
| | - Sangeetha Balaji
- Department of General Medicine, Government Medical College, Omandurar Government Estate, Chennai 600018, Tamil Nadu, India
| | - Naveen Jeyaraman
- Department of Orthopaedics, ACS Medical College and Hospital, Dr MGR Educational and Research Institute, Chennai 600077, Tamil Nadu, India
| | - Arulkumar Nallakumarasamy
- Department of Orthopaedics, ACS Medical College and Hospital, Dr MGR Educational and Research Institute, Chennai 600077, Tamil Nadu, India
| | - Shilpa Sharma
- Department of Paediatric Surgery, All India Institute of Medical Sciences, Delhi 110029, New Delhi, India
| |
Collapse
|
97
|
Evaluating the Efficacy of ChatGPT as a Valuable Resource for Pharmacology Studies in Traditional and Complementary Medicine (T&CM) Education. ADVANCES IN EDUCATIONAL TECHNOLOGIES AND INSTRUCTIONAL DESIGN 2023:1-17. [DOI: 10.4018/978-1-6684-9300-7.ch001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/02/2024]
Abstract
Artificial intelligence (AI) is gaining increasing prominence in the field of education, yet comprehensive investigations into its underlying patterns, research limitations, and potential applications remain scarce. ChatGPT, an AI-powered platform developed by the AI research and deployment company OpenAI, allows users to input text instructions and receive prompt textual responses based on its machine learning-driven interactions with online information sources. This study aims to assess the efficacy of ChatGPT in addressing student-centered medical inquiries pertaining to pharmacology, thereby examining its relevance as a self-study resource to enhance the learning experiences of students. Specifically, the study encompasses various domains of pharmacology, such as pharmacokinetics, mechanism of action, clinical uses, adverse effects, contraindications, and drug-drug interactions. The findings demonstrate that ChatGPT provides pertinent and accurate answers to these questions.
Collapse
|
98
|
Brameier DT, Alnasser AA, Carnino JM, Bhashyam AR, von Keudell AG, Weaver MJ. Artificial Intelligence in Orthopaedic Surgery: Can a Large Language Model "Write" a Believable Orthopaedic Journal Article? J Bone Joint Surg Am 2023; 105:1388-1392. [PMID: 37437021 DOI: 10.2106/jbjs.23.00473] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 07/14/2023]
Abstract
➢ Natural language processing with large language models is a subdivision of artificial intelligence (AI) that extracts meaning from text with use of linguistic rules, statistics, and machine learning to generate appropriate text responses. Its utilization in medicine and in the field of orthopaedic surgery is rapidly growing.
➢ Large language models can be utilized in generating scientific manuscript texts of a publishable quality; however, they suffer from AI hallucinations, in which untruths or half-truths are stated with misleading confidence. Their use raises considerable concerns regarding the potential for research misconduct and for hallucinations to insert misinformation into the clinical literature.
➢ Current editorial processes are insufficient for identifying the involvement of large language models in manuscripts. Academic publishing must adapt to encourage safe use of these tools by establishing clear guidelines for their use, which should be adopted across the orthopaedic literature, and by implementing additional steps in the editorial screening process to identify the use of these tools in submitted manuscripts.
Collapse
Affiliation(s)
- Devon T Brameier
- Department of Orthopaedic Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts
| | - Ahmad A Alnasser
- Department of Orthopaedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts
| | - Jonathan M Carnino
- Boston University Chobanian & Avedisian School of Medicine, Boston, Massachusetts
| | - Abhiram R Bhashyam
- Department of Orthopaedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts
| | - Arvind G von Keudell
- Department of Orthopaedic Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts
- Bispebjerg Hospital, University of Copenhagen, Copenhagen, Denmark
| | - Michael J Weaver
- Department of Orthopaedic Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts
| |
Collapse
|
99
|
Hua HU, Kaakour AH, Rachitskaya A, Srivastava S, Sharma S, Mammo DA. Evaluation and Comparison of Ophthalmic Scientific Abstracts and References by Current Artificial Intelligence Chatbots. JAMA Ophthalmol 2023; 141:819-824. [PMID: 37498609 PMCID: PMC10375387 DOI: 10.1001/jamaophthalmol.2023.3119] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/07/2023] [Accepted: 05/26/2023] [Indexed: 07/28/2023]
Abstract
Importance Language-learning model-based artificial intelligence (AI) chatbots are growing in popularity and have significant implications for both patient education and academia. Drawbacks of using AI chatbots in generating scientific abstracts and reference lists, including inaccurate content coming from hallucinations (ie, AI-generated output that deviates from its training data), have not been fully explored. Objective To evaluate and compare the quality of ophthalmic scientific abstracts and references generated by earlier and updated versions of a popular AI chatbot. Design, Setting, and Participants This cross-sectional comparative study used 2 versions of an AI chatbot to generate scientific abstracts and 10 references for clinical research questions across 7 ophthalmology subspecialties. The abstracts were graded by 2 authors using modified DISCERN criteria and performance evaluation scores. Main Outcomes and Measures Scores for the chatbot-generated abstracts were compared using the t test. Abstracts were also evaluated by 2 AI output detectors. A hallucination rate for unverifiable references generated by the earlier and updated versions of the chatbot was calculated and compared. Results The mean modified AI-DISCERN scores for the chatbot-generated abstracts were 35.9 and 38.1 (maximum of 50) for the earlier and updated versions, respectively (P = .30). Using the 2 AI output detectors, the mean fake scores (with a score of 100% meaning generated by AI) for the earlier and updated chatbot-generated abstracts were 65.4% and 10.8%, respectively (P = .01), for one detector and were 69.5% and 42.7% (P = .17) for the second detector. The mean hallucination rates for unverifiable references generated by the earlier and updated versions were 33% and 29% (P = .74). Conclusions and Relevance Both versions of the chatbot generated average-quality abstracts. There was a high hallucination rate of generating fake references, and caution should be used when using these AI resources for health education or academic purposes.
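The "hallucination rate" reported in this abstract is simply the share of generated citations that could not be verified against the literature. A minimal sketch of that calculation (the data and verification flags below are hypothetical, not the study's actual records):

```python
def hallucination_rate(references):
    """Fraction of (citation, verifiable) pairs whose citation could not be verified."""
    if not references:
        return 0.0
    unverifiable = sum(1 for _, verifiable in references if not verifiable)
    return unverifiable / len(references)

# Hypothetical verification results for references produced by two chatbot
# versions; the True/False flags are illustrative only.
earlier = [(f"ref-{i}", i % 3 != 0) for i in range(30)]   # 10 of 30 unverifiable
updated = [(f"ref-{i}", i % 7 != 0) for i in range(28)]   # 4 of 28 unverifiable

print(round(hallucination_rate(earlier) * 100))  # 33
print(round(hallucination_rate(updated) * 100))  # 14
```

In the study itself, rates like these for the two chatbot versions were then compared statistically (the abstract reports 33% vs 29%, P = .74).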
Collapse
Affiliation(s)
- Hong-Uyen Hua
- Cole Eye Institute, Cleveland Clinic Foundation, Cleveland, Ohio
| | | | | | - Sunil Srivastava
- Cole Eye Institute, Cleveland Clinic Foundation, Cleveland, Ohio
| | - Sumit Sharma
- Cole Eye Institute, Cleveland Clinic Foundation, Cleveland, Ohio
| | - Danny A. Mammo
- Cole Eye Institute, Cleveland Clinic Foundation, Cleveland, Ohio
| |
Collapse
|
100
|
Panthier C, Gatinel D. Success of ChatGPT, an AI language model, in taking the French language version of the European Board of Ophthalmology examination: A novel approach to medical knowledge assessment. J Fr Ophtalmol 2023; 46:706-711. [PMID: 37537126 DOI: 10.1016/j.jfo.2023.05.006] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2023] [Revised: 05/27/2023] [Accepted: 05/31/2023] [Indexed: 08/05/2023]
Abstract
PURPOSE The purpose of this study was to evaluate the performance of ChatGPT, a cutting-edge artificial intelligence (AI) language model developed by OpenAI, in successfully completing the French language version of the European Board of Ophthalmology (EBO) examination and to assess its potential role in medical education and knowledge assessment. METHODS ChatGPT, based on the GPT-4 architecture, was exposed to a series of EBO examination questions in French, covering various aspects of ophthalmology. The AI's performance was evaluated by comparing its responses with the correct answers provided by ophthalmology experts. Additionally, the study assessed the time taken by ChatGPT to answer each question as a measure of efficiency. RESULTS ChatGPT achieved a 91% success rate on the EBO examination, demonstrating a high level of competency in ophthalmology knowledge and application. The AI provided correct answers across all question categories, indicating a strong understanding of basic sciences, clinical knowledge, and clinical management. The AI model also answered the questions rapidly, taking only a fraction of the time needed by human test-takers. CONCLUSION ChatGPT's performance on the French language version of the EBO examination demonstrates its potential to be a valuable tool in medical education and knowledge assessment. Further research is needed to explore optimal ways to implement AI language models in medical education and to address the associated ethical and practical concerns.
Collapse
Affiliation(s)
- C Panthier
- Department of Ophthalmology, Rothschild Foundation Hospital, 25, rue Manin, 75019 Paris, France; Center of Expertise and Research in Optics for Vision (CEROV), Paris, France
| | - D Gatinel
- Department of Ophthalmology, Rothschild Foundation Hospital, 25, rue Manin, 75019 Paris, France; Center of Expertise and Research in Optics for Vision (CEROV), Paris, France.
| |
Collapse
|