1
Ozdag Y, Mahmoud M, Klena JC, Grandizio LC. Artificial Intelligence in Personal Statements Within Orthopaedic Surgery Residency Applications. J Am Acad Orthop Surg 2025; 33:554-560. PMID: 40101179. DOI: 10.5435/jaaos-d-24-01285.
Abstract
PURPOSE Artificial intelligence (AI) has been increasingly studied within medical education and clinical practice. At present, it remains uncertain whether AI is being used to write personal statements (PSs) for orthopaedic surgery residency applications. Our purpose was to analyze PSs submitted to our institution and determine the rate of AI utilization within these texts. METHODS Four groups were created for comparison: 100 PSs submitted before the release of ChatGPT (PRE-PS), 100 PSs submitted after ChatGPT's introduction (POST-PS), 10 AI-generated PSs (AI-PS), and 10 hybrid PSs (H-PS) containing both human-generated and AI-generated text. For each of the four groups, AI detection software (GPTZero) was used to quantify the percentage of human-generated text, "mixed" text, and AI-generated text. In addition, the detection software provided a level of confidence (highly confident, moderately confident, uncertain) with respect to the "final verdict" of human-generated versus AI-generated text. RESULTS The percentages of human-generated text in the PRE-PS, POST-PS, H-PS, and AI-PS groups were 94%, 93%, 28%, and 0%, respectively. All 200 PSs (100%) submitted to our program had a final verdict of "human" with verdict confidence of >90%. By contrast, all AI-generated statements (H-PS and AI-PS groups) had a final verdict of "AI." Verdict confidence for the AI-PS group was 100%. CONCLUSION Orthopaedic surgery residency applicants do not appear, at present, to be using AI to create the PSs included in their applications. AI detection software (GPTZero) appears able to accurately distinguish human-generated from AI-generated PSs for orthopaedic residency applications. Considering the increasing role and development of AI software, future investigations should explore whether these results change over time. As with orthopaedic journals, guidelines should be established for the use of AI in postgraduate training applications. LEVEL OF EVIDENCE V-Nonclinical.
Affiliation(s)
- Yagiz Ozdag
- From the Department of Orthopaedic Surgery, Geisinger Commonwealth School of Medicine, Geisinger Musculoskeletal Institute, Danville, PA
2
Lenert LA. How the National Library of Medicine should evolve in an era of artificial intelligence. J Am Med Inform Assoc 2025; 32:968-970. PMID: 40063704. PMCID: PMC12012362. DOI: 10.1093/jamia/ocaf041.
Abstract
OBJECTIVES This article describes the challenges faced by the National Library of Medicine with the rise of artificial intelligence (AI) and access to human knowledge through large language models (LLMs). BACKGROUND AND SIGNIFICANCE The rise of AI as a tool for the acceleration and falsification of science is impacting every aspect of the transformation of data to information, knowledge, and wisdom through the scientific process. APPROACH This perspective discusses the philosophical foundations, threats, and opportunities of the AI revolution, with a proposal for restructuring the mission of the National Library of Medicine (NLM), part of the National Institutes of Health, around a central role as the guardian of the integrity of scientific knowledge in an era of AI-driven science. RESULTS The NLM can rise to the new challenges posed by AI by working from its foundations in theories of information science and embracing new roles. Three paths for the NLM are proposed: (1) become an authentication authority for data, information, and knowledge through systems of scientific provenance; (2) become an observatory of the state of human health science, supporting living systematic reviews; and (3) become a hub for culturally appropriate, bespoke translation, transformation, and summarization for different users (patients and the public, as well as scientists and clinicians) using AI technologies. DISCUSSION Adapting the NLM to the challenges of the Internet revolution by developing worldwide-web-accessible resources allowed the NLM to rise to new heights. Bold moves are needed to adapt the Library to the AI revolution, but they offer similar prospects of even greater impacts on the advancement of science and human health.
Affiliation(s)
- Leslie Andrew Lenert
- Biomedical Informatics Center, Medical University of South Carolina, Charleston, SC 29405, United States
3
Liang S, Zhang J, Liu X, Huang Y, Shao J, Liu X, Li W, Wang G, Wang C. The potential of large language models to advance precision oncology. EBioMedicine 2025; 115:105695. PMID: 40305985. DOI: 10.1016/j.ebiom.2025.105695.
Abstract
With the rapid development of artificial intelligence (AI) within medicine, the emergence of large language models (LLMs) has gradually reached the forefront of clinical research. In oncology, by mining the underlying connection between a text or image input and the desired output, LLMs demonstrate great potential for managing tumours. In this review, we provide a brief description of the development of LLMs, followed by model construction strategies and general medical functions. We then elaborate on the role of LLMs in cancer screening and diagnosis, metastasis identification, tumour staging, treatment recommendation, and documentation processing tasks by decoding various types of clinical data. Moreover, the current barriers faced by LLMs, such as hallucinations, ethical problems, and limited application, are outlined along with corresponding solutions, with the aim of inspiring improvement and innovation in harnessing LLMs to advance precision oncology.
Affiliation(s)
- Shufan Liang
- Department of Pulmonary and Critical Care Medicine, State Key Laboratory of Respiratory Health and Multimorbidity, Targeted Tracer Research and Development Laboratory, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, West China School of Medicine, Sichuan University, Chengdu, China
- Jiangjiang Zhang
- State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
- Xingting Liu
- Department of Pulmonary and Critical Care Medicine, State Key Laboratory of Respiratory Health and Multimorbidity, Targeted Tracer Research and Development Laboratory, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, West China School of Medicine, Sichuan University, Chengdu, China
- Yinkui Huang
- State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
- Jun Shao
- Department of Pulmonary and Critical Care Medicine, State Key Laboratory of Respiratory Health and Multimorbidity, Targeted Tracer Research and Development Laboratory, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, West China School of Medicine, Sichuan University, Chengdu, China
- Xiaohong Liu
- State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China; UCL Cancer Institute, University College London, London WC1E 6BT, UK
- Weimin Li
- Department of Pulmonary and Critical Care Medicine, State Key Laboratory of Respiratory Health and Multimorbidity, Targeted Tracer Research and Development Laboratory, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, West China School of Medicine, Sichuan University, Chengdu, China; Frontiers Medical Center, Tianfu Jincheng Laboratory, Chengdu, China
- Guangyu Wang
- State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
- Chengdi Wang
- Department of Pulmonary and Critical Care Medicine, State Key Laboratory of Respiratory Health and Multimorbidity, Targeted Tracer Research and Development Laboratory, Frontiers Science Center for Disease-Related Molecular Network, West China Hospital, West China School of Medicine, Sichuan University, Chengdu, China; Frontiers Medical Center, Tianfu Jincheng Laboratory, Chengdu, China
4
Soddu M, De Vito A, Madeddu G, Nicolosi B, Provenzano M, Ivziku D, Curcio F. Assessing the Accuracy, Completeness and Safety of ChatGPT-4o Responses on Pressure Injuries in Infants: Clinical Applications and Future Implications. Nursing Reports 2025; 15:130. PMID: 40333050. PMCID: PMC12029477. DOI: 10.3390/nursrep15040130.
Abstract
Background/Objectives: The advent of large language models (LLMs) such as ChatGPT, capable of generating quick and interactive answers to complex questions, opens the way for new approaches to training healthcare professionals, enabling them to acquire up-to-date and specialised information easily. In nursing, LLMs have proven to support clinical decision making, continuing education, the development of care plans and the management of complex clinical cases, as well as the writing of academic reports and scientific articles. Furthermore, the ability to provide rapid access to up-to-date scientific information can improve the quality of care and promote evidence-based practice. However, their applicability in clinical practice requires thorough evaluation. This study evaluated the accuracy, completeness and safety of the responses generated by ChatGPT-4o on pressure injuries (PIs) in infants. Methods: In January 2025, we analysed the responses generated by ChatGPT-4o to 60 queries, subdivided into 12 main topics, on PIs in infants. The questions were developed through consultation of authoritative documents, based on their relevance to nursing care and clinical potential. A panel of five experts, using a 5-point Likert scale, assessed the accuracy, completeness and safety of the answers generated by ChatGPT-4o. Results: Overall, over 90% of the responses generated by ChatGPT-4o received relatively high ratings for the three criteria assessed, with 4 being the most frequent rating. However, when analysing the 12 topics individually, we observed that Medical Device Management and Technological Innovation had the lowest accuracy scores, while Scientific Evidence and Technological Innovation had the lowest completeness scores. No answers were rated as completely incorrect for any of the three criteria. Conclusions: ChatGPT-4o showed a good level of accuracy, completeness and safety in addressing questions about pressure injuries in infants. However, ongoing updates and integration of high-quality scientific sources are essential for ensuring its reliability as a clinical decision-support tool.
Affiliation(s)
- Marica Soddu
- University Hospital of Sassari, Viale San Pietro 10, 07100 Sassari, Italy
- Andrea De Vito
- Department of Medicine, Surgery, and Pharmacy, University of Sassari, 07100 Sassari, Italy
- Giordano Madeddu
- Department of Medicine, Surgery, and Pharmacy, University of Sassari, 07100 Sassari, Italy
- Biagio Nicolosi
- Department of Health Professions, AOU Meyer IRCCS, 50139 Florence, Italy
- Maria Provenzano
- Unit of General Surgery, Santissima Trinità Hospital, 09121 Cagliari, Italy
- Dhurata Ivziku
- Department of Health Professions, Fondazione Policlinico Universitario Campus Bio-Medico, 00128 Rome, Italy
- Felice Curcio
- Department of Medicine, Surgery, and Pharmacy, University of Sassari, 07100 Sassari, Italy
- Faculty of Medicine and Surgery, University of Sassari (UNISS), 07100 Sassari, Italy
5
Sadowsky SJ. Can ChatGPT be trusted as a resource for a scholarly article on treatment planning implant-supported prostheses? J Prosthet Dent 2025:S0022-3913(25)00258-6. PMID: 40210509. DOI: 10.1016/j.prosdent.2025.03.025.
Abstract
STATEMENT OF PROBLEM Access to artificial intelligence is ubiquitous, but its limitations in the preparation of scholarly articles on implant restorative treatment planning have not been established. PURPOSE The purpose of this study was to determine whether ChatGPT can be a reliable resource in synthesizing the best available literature on treatment planning questions for implant-supported prostheses. MATERIAL AND METHODS Six questions were posed to ChatGPT on treatment planning implant-supported prostheses for the partially edentulous and completely edentulous scenarios. Question 1: Would higher crown-to-implant (C/I) ratios greater than 1:1 be linked to increased marginal bone loss? Question 2: Do 2-unit posterior cantilevers lead to more bone loss than 2 adjacent implants? Question 3: Should implants be splinted in the posterior maxilla in patients who require no grafting and are not bruxers? Question 4: Do patients prefer a maxillary implant overdenture to a well-made complete denture? Question 5: Do resilient and rigid anchorage systems have the same maintenance when comparing implant overdentures? Question 6: Do denture patients prefer fixed implant prostheses compared with removable implant prostheses? Follow-up questions were intended to clarify the source and content of the supporting evidence for ChatGPT's responses. Additional higher-quality and timely studies indexed on PubMed were identified for ChatGPT to consider in a revision of its original implant treatment planning answer. A quantitative rating was assessed based on 4 indices: accurate/retrievable source, representative literature, accurate interpretation of evidence, and original conclusion reflecting the best evidence. RESULTS ChatGPT's responses were as follows. Question 1: "Higher C/I can be associated with an increased risk of marginal bone loss." Revision: "While many clinicians believe that higher C/I ratios lead to bone loss, recent evidence suggests that this concern is less relevant for modern implants." Question 2: "The presence of cantilever extensions with short implants tend to fail at earlier time points and has been associated with a higher incidence of technical complications." Revision: "The use of implant-supported single-unit crowns with cantilever extensions in posterior regions is a viable long-term treatment option with minimal complications." Question 3: "Splinted restorations were associated with a higher implant survival rate, particularly in the posterior region." Revision: "There is no compelling evidence to suggest that splinting all implants in the posterior maxilla is necessary." Question 4: "Patients report higher satisfaction with maxillary implant-supported overdentures compared to conventional complete dentures." Revision: "For patients with adequate maxillary bone support, a conventional denture may be just as satisfactory as an implant overdenture." Question 5: "While resilient attachments may require more frequent replacement of components, rigid attachments might necessitate monitoring for implant-related complications due to increased stress." Revision: "Research indicates that rigid attachment systems, such as bar and telescopic attachments, do not necessarily lead to increased complications due to stress in implant overdentures." Question 6: "Yes, in general, denture patients tend to prefer fixed implant prostheses over removable implant prostheses due to several key advantages. However, preferences can vary based on individual needs, costs, and clinical factors." Revision: "There is no universal patient preference for fixed or removable implant prostheses. Satisfaction is generally high with both options, and preference depends on individual patient factors, including comfort, hygiene, cost, and anatomical considerations."
CONCLUSIONS ChatGPT has not demonstrated the ability to accurately cull the literature, stratify the rigor of the evidence, and extract accurate implications from the studies selected to deliver the best evidence-based answers to questions on treatment planning implant-supported prostheses.
Affiliation(s)
- Steven J Sadowsky
- Professor Emeritus, Preventive and Restorative Department, University of the Pacific Arthur A. Dugoni School of Dentistry, San Francisco, Calif.
6
Al-Rawas M, Qader OAJA, Othman NH, Ismail NH, Mamat R, Halim MS, Abdullah JY, Noorani TY. Identification of dental related ChatGPT generated abstracts by senior and young academicians versus artificial intelligence detectors and a similarity detector. Sci Rep 2025; 15:11275. PMID: 40175423. PMCID: PMC11965432. DOI: 10.1038/s41598-025-95387-y.
Abstract
Several researchers have investigated the consequences of using ChatGPT in the education industry. Their findings raised doubts regarding the probable effects that ChatGPT may have on academia. As such, the present study aimed to assess the ability of three methods, namely (1) academicians (senior and young), (2) three AI detectors (GPT-2 output detector, Writefull GPT detector, and GPTZero), and (3) one plagiarism detector, to differentiate between human- and ChatGPT-written abstracts. A total of 160 abstracts were assessed by these three methods. Two senior and two young academicians used a newly developed rubric to assess the type and quality of 80 human-written and 80 ChatGPT-written abstracts. The results were statistically analysed using crosstabulation and chi-square analysis. Bivariate correlation and the accuracy of the methods were assessed. The findings demonstrated that all three methods made a variety of incorrect assumptions. The level of academician experience may play a role in detection ability, with senior academician 1 demonstrating superior accuracy. The GPTZero AI and similarity detectors were very good at accurately identifying the abstracts' origin. In terms of abstract type, every variable correlated positively, except in the case of the similarity detector (p < 0.05). Human-AI collaboration may significantly benefit the identification of abstract origins.
Affiliation(s)
- Matheel Al-Rawas
- Prosthodontic Unit, School of Dental Sciences, Universiti Sains Malaysia, Health Campus, Kubang Kerian, Kota Bharu, Kelantan, Malaysia
- Hospital Pakar Universiti Sains Malaysia, Kubang Kerian, Kota Bharu, Kelantan, Malaysia
- Nurul Hanim Othman
- Prosthodontic Unit, School of Dental Sciences, Universiti Sains Malaysia, Health Campus, Kubang Kerian, Kota Bharu, Kelantan, Malaysia
- Hospital Pakar Universiti Sains Malaysia, Kubang Kerian, Kota Bharu, Kelantan, Malaysia
- Noor Huda Ismail
- Prosthodontic Unit, School of Dental Sciences, Universiti Sains Malaysia, Health Campus, Kubang Kerian, Kota Bharu, Kelantan, Malaysia
- Hospital Pakar Universiti Sains Malaysia, Kubang Kerian, Kota Bharu, Kelantan, Malaysia
- Rosnani Mamat
- Hospital Pakar Universiti Sains Malaysia, Kubang Kerian, Kota Bharu, Kelantan, Malaysia
- Conservative Dentistry Unit, School of Dental Sciences, Universiti Sains Malaysia, Health Campus, Kubang Kerian, Kota Bharu, Kelantan, Malaysia
- Mohamad Syahrizal Halim
- Hospital Pakar Universiti Sains Malaysia, Kubang Kerian, Kota Bharu, Kelantan, Malaysia
- Conservative Dentistry Unit, School of Dental Sciences, Universiti Sains Malaysia, Health Campus, Kubang Kerian, Kota Bharu, Kelantan, Malaysia
- Johari Yap Abdullah
- Craniofacial Imaging Laboratory, School of Dental Sciences, Universiti Sains Malaysia, Health Campus, 16150 Kubang Kerian, Kota Bharu, Kelantan, Malaysia
- Dental Research Unit, Center for Transdisciplinary Research (CFTR), Saveetha Dental College, Saveetha Institute of Medical and Technical Sciences (SIMATS), Saveetha University, Chennai, Tamil Nadu, India
- Tahir Yusuf Noorani
- Hospital Pakar Universiti Sains Malaysia, Kubang Kerian, Kota Bharu, Kelantan, Malaysia
- Conservative Dentistry Unit, School of Dental Sciences, Universiti Sains Malaysia, Health Campus, Kubang Kerian, Kota Bharu, Kelantan, Malaysia
- Saveetha Dental College and Hospitals, Saveetha Institute of Medical and Technical Sciences (SIMATS), Saveetha University, Chennai, Tamil Nadu, India
7
El Zoghbi M, Malhotra A, Bilal M, Shaukat A. Impact of Artificial Intelligence on Clinical Research. Gastrointest Endosc Clin N Am 2025; 35:445-455. PMID: 40021240. DOI: 10.1016/j.giec.2024.10.002.
Abstract
Artificial intelligence (AI) has the potential to significantly impact clinical research, both in research preparation and in data interpretation. The development of AI tools that can help perform literature searches, synthesize and streamline data collection and analysis, and format studies could make the clinical research process more efficient. Several of these tools have been developed and trialed, and many more are being rapidly developed. This article highlights AI applications in clinical research in gastroenterology, including its impact on drug discovery, and explores areas where further guidance is needed to supplement current understanding and enhance its use.
Affiliation(s)
- Maysaa El Zoghbi
- Department of Medicine, NYU Grossman School of Medicine, New York, NY, USA
- Ashish Malhotra
- Department of Medicine, NYU Grossman School of Medicine, New York, NY, USA
- Mohammad Bilal
- University of Minnesota, Minneapolis VA Medical Center, Minneapolis, MN, USA
- Aasma Shaukat
- Department of Medicine, NYU Grossman School of Medicine, New York, NY, USA
8
Dashti M, Londono J, Ghasemi S, Moghaddasi N. How much can we rely on artificial intelligence chatbots such as the ChatGPT software program to assist with scientific writing? J Prosthet Dent 2025; 133:1082-1088. PMID: 37438164. DOI: 10.1016/j.prosdent.2023.05.023.
Abstract
STATEMENT OF PROBLEM Use of the ChatGPT software program by authors raises many questions, primarily regarding egregious issues such as plagiarism. Nevertheless, little is known about the extent to which artificial intelligence (AI) models can produce high-quality research publications and advance and shape the direction of a research topic. PURPOSE The purpose of this study was to determine how well the ChatGPT software program, a writing tool powered by AI, could respond to questions about scientific or research writing and generate accurate references with academic examples. MATERIAL AND METHODS The ChatGPT software program was asked to locate an abstract containing a particular keyword in the Journal of Prosthetic Dentistry (JPD), and whether the resulting articles existed or had been published was then determined. The algorithm was queried 5 times to locate 5 JPD articles containing 2 specific keywords, bringing the total number of articles to 25. The process was repeated twice, each time with a different set of keywords, so that the ChatGPT software program provided a total of 75 articles. The search was conducted at various times between April 1 and 4, 2023. Finally, 2 authors independently searched the JPD website and Google Scholar to determine whether the articles provided by the ChatGPT software program existed. RESULTS When the authors tested the ChatGPT software program's ability to locate articles in the JPD and Google Scholar using the sets of keywords, the results did not match the articles that the ChatGPT software program had generated. None of the 75 articles provided by the ChatGPT software program could be located in the JPD or Google Scholar databases, and the relevant references had to be added manually to ensure their accuracy. CONCLUSIONS Researchers and academic scholars must be cautious when using the ChatGPT software program because AI-generated content cannot provide or analyze the same information as an author or researcher. In addition, the results indicated that crediting or citing such content in prestigious academic journals is not yet appropriate. At this time, scientific writing is only valid when performed manually by researchers.
Affiliation(s)
- Mahmood Dashti
- Researcher, School of Dentistry, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
- Jimmy Londono
- Professor and Director of the Prosthodontics Residency Program and the Ronald Goldstein Center for Esthetics and Implant Dentistry, The Dental College of Georgia at Augusta University, Augusta, Ga
- Shohreh Ghasemi
- Adjunct Assistant Professor, Department of Oral and Maxillofacial Surgery, The Dental College of Georgia at Augusta University, Augusta, Ga
- Negar Moghaddasi
- Researcher, College of Dental Medicine, Western University of Health Sciences, Pomona, Calif
9
Hasan SS, Fury MS, Woo JJ, Kunze KN, Ramkumar PN. Ethical Application of Generative Artificial Intelligence in Medicine. Arthroscopy 2025; 41:874-885. PMID: 39689842. DOI: 10.1016/j.arthro.2024.12.011.
Abstract
Generative artificial intelligence (AI) may revolutionize health care, providing solutions that range from enhancing diagnostic accuracy to personalizing treatment plans. However, its rapid and largely unregulated integration into medicine raises ethical concerns related to data integrity, patient safety, and appropriate oversight. One of the primary ethical challenges lies in generative AI's potential to produce misleading or fabricated information, posing risks of misdiagnosis or inappropriate treatment recommendations, which underscore the necessity for robust physician oversight. Transparency also remains a critical concern, as the closed-source nature of many large-language models prevents both patients and health care providers from understanding the reasoning behind AI-generated outputs, potentially eroding trust. The lack of regulatory approval for AI as a medical device, combined with concerns around the security of patient-derived data and AI-generated synthetic data, further complicates its safe integration into clinical workflows. Furthermore, synthetic datasets generated by AI, although valuable for augmenting research in areas with scarce data, complicate questions of data ownership, patient consent, and scientific validity. In addition, generative AI's ability to streamline administrative tasks risks depersonalizing care, further distancing providers from patients. These challenges compound the deeper issues plaguing the health care system, including the emphasis on volume and speed over value and expertise. The use of generative AI in medicine brings about mass scaling of synthetic information, thereby necessitating careful adoption to protect patient care and medical advancement. Given these considerations, generative AI applications warrant regulatory and critical scrutiny. Key starting points include establishing strict standards for data security and transparency, implementing oversight akin to institutional review boards to govern data usage, and developing interdisciplinary guidelines that involve developers, clinicians, and ethicists. By addressing these concerns, we can better align generative AI adoption with the core foundations of humanistic health care, preserving patient safety, autonomy, and trust while harnessing AI's transformative potential. LEVEL OF EVIDENCE: Level V, expert opinion.
Affiliation(s)
| | - Matthew S Fury
- Baton Rouge Orthopaedic Clinic, Baton Rouge, Louisiana, U.S.A
| | - Joshua J Woo
- Brown University/The Warren Alpert School of Brown University, Providence, Rhode Island, U.S.A
| | - Kyle N Kunze
- Hospital for Special Surgery, New York, New York, U.S.A
| | | |
10
Raman R. Transparency in research: An analysis of ChatGPT usage acknowledgment by authors across disciplines and geographies. Account Res 2025; 32:277-298. PMID: 37877216. DOI: 10.1080/08989621.2023.2273377.
Abstract
This investigation systematically reviews the recognition of generative AI tools, particularly ChatGPT, in scholarly literature. Utilizing 1,226 publications from the Dimensions database, ranging from November 2022 to July 2023, the research scrutinizes temporal trends and distribution across disciplines and regions. U.S.-based authors lead in acknowledgments, with notable contributions from China and India. Predominantly, Biomedical and Clinical Sciences, as well as Information and Computing Sciences, are engaging with these AI tools. Publications like "The Lancet Digital Health" and platforms such as "bioRxiv" are recurrent venues for such acknowledgments, highlighting AI's growing impact on research dissemination. The analysis is confined to the Dimensions database, thus potentially overlooking other sources and grey literature. Additionally, the study abstains from examining the acknowledgments' quality or ethical considerations. Findings are beneficial for stakeholders, providing a basis for policy and scholarly discourse on ethical AI use in academia. This study represents the inaugural comprehensive empirical assessment of AI acknowledgment patterns in academic contexts, addressing a previously unexplored aspect of scholarly communication.
Affiliation(s)
- Raghu Raman
- Amrita School of Business, Amrita Vishwa Vidyapeetham, Amritapuri, Kerala, India
11
Piras A, Mastroleo F, Colciago RR, Morelli I, D'Aviero A, Longo S, Grassi R, Iorio GC, De Felice F, Boldrini L, Desideri I, Salvestrini V. How Italian radiation oncologists use ChatGPT: a survey by the young group of the Italian association of radiotherapy and clinical oncology (yAIRO). Radiol Med 2025; 130:453-462. [PMID: 39690359] [DOI: 10.1007/s11547-024-01945-1]
Abstract
PURPOSE To investigate the awareness and spread of ChatGPT and its possible role in both scientific research and clinical practice among young radiation oncologists (ROs). MATERIAL AND METHODS An anonymous online survey via Google Forms (including 24 questions) was distributed to young (< 40 years old) ROs in Italy through the yAIRO network from March 15 to March 31, 2024. These ROs were officially registered with yAIRO in 2023. We particularly focused on the emerging use of ChatGPT and its future perspectives in clinical practice. RESULTS A total of 76 young physicians answered the survey. Seventy-three participants declared themselves familiar with ChatGPT, and 71.1% of the surveyed physicians had already used it. Thirty-one (40.8%) participants strongly agreed that AI has the potential to change the medical landscape in the future. Additionally, 79.1% of respondents agreed that AI will be most successful in research processes such as literature review and drafting articles/protocols. This belief in ChatGPT's potential translated into direct use in daily practice in 43.4% of cases, mostly with a fair grade of satisfaction (43.2%). A large proportion of participants (69.7%) believe in the implementation of ChatGPT into clinical practice, even though 53.9% fear an overall negative impact. CONCLUSIONS The results of the present survey clearly highlight the attitude of young Italian ROs toward the implementation of ChatGPT into clinical and academic RO practice. ChatGPT is considered a valuable and effective tool that can ease current and future workflows.
Affiliation(s)
- Antonio Piras
- UO Radioterapia Oncologica, Villa Santa Teresa, 90011, Bagheria, Palermo, Italy
- Ri.Med Foundation, 90133, Palermo, Italy
- Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties, Molecular and Clinical Medicine, University of Palermo, 90127, Palermo, Italy
- Radiation Oncology, Mater Olbia Hospital, Olbia, Sassari, Italy
- Federico Mastroleo
- Division of Radiation Oncology, IEO European Institute of Oncology IRCCS, 20141, Milan, Italy
- Department of Oncology and Hemato-Oncology, University of Milan, 20141, Milan, Italy
- Riccardo Ray Colciago
- School of Medicine and Surgery, University of Milano Bicocca, Piazza Dell'Ateneo Nuovo, 1, 20126, Milan, Italy
- Ilaria Morelli
- Radiation Oncology Unit, Department of Experimental and Clinical Biomedical Sciences, Azienda Ospedaliero-Universitaria Careggi, University of Florence, Florence, Italy
- Andrea D'Aviero
- Department of Radiation Oncology, "S.S. Annunziata" Chieti Hospital, Chieti, Italy
- Department of Medical, Oral and Biotechnological Sciences, "G. D'Annunzio" University of Chieti, Chieti, Italy
- Silvia Longo
- UOC Radioterapia Oncologica, Fondazione Policlinico Universitario "A. Gemelli" IRCCS, Rome, Italy
- Roberta Grassi
- Department of Precision Medicine, University of Campania "L. Vanvitelli", Naples, Italy
- Francesca De Felice
- Radiation Oncology, Policlinico Umberto I, Department of Radiological, Oncological and Pathological Sciences, "Sapienza" University of Rome, Rome, Italy
- Luca Boldrini
- UOC Radioterapia Oncologica, Fondazione Policlinico Universitario "A. Gemelli" IRCCS, Rome, Italy
- Università Cattolica del Sacro Cuore, Rome, Italy
- Isacco Desideri
- Radiation Oncology Unit, Department of Experimental and Clinical Biomedical Sciences, Azienda Ospedaliero-Universitaria Careggi, University of Florence, Florence, Italy
- Viola Salvestrini
- Radiation Oncology Unit, Department of Experimental and Clinical Biomedical Sciences, Azienda Ospedaliero-Universitaria Careggi, University of Florence, Florence, Italy
12
Ozkara BB, Boutet A, Comstock BA, Van Goethem J, Huisman TAGM, Ross JS, Saba L, Shah LM, Wintermark M, Castillo M. Artificial Intelligence-Generated Editorials in Radiology: Can Expert Editors Detect Them? AJNR Am J Neuroradiol 2025; 46:559-566. [PMID: 39288967] [PMCID: PMC11979811] [DOI: 10.3174/ajnr.a8505]
Abstract
BACKGROUND AND PURPOSE Artificial intelligence is capable of generating complex texts that may be indistinguishable from those written by humans. We aimed to evaluate the ability of GPT-4 to write radiology editorials and to compare these with human-written counterparts, thereby determining their real-world applicability for scientific writing. MATERIALS AND METHODS Sixteen editorials from 8 journals were included. To generate the artificial intelligence (AI)-written editorials, the summary of 16 human-written editorials was fed into GPT-4. Six experienced editors reviewed the articles. First, an unpaired approach was used. The raters were asked to evaluate the content of each article by using a 1-5 Likert scale across specified metrics. Then, they determined whether the editorials were written by humans or AI. The articles were then evaluated in pairs to determine which article was generated by AI and which should be published. Finally, the articles were analyzed with an AI detector and for plagiarism. RESULTS The human-written articles had a median AI probability score of 2.0%, whereas the AI-written articles had a median of 58%. The median similarity score among AI-written articles was 3%. Fifty-eight percent of unpaired articles were correctly classified regarding authorship; rating accuracy increased to 70% in the paired setting. AI-written articles received slightly higher scores in most metrics. When stratified by perception, articles perceived as human-written were rated higher in most categories. In the paired setting, raters strongly preferred publishing the article they perceived as human-written (82%). CONCLUSIONS GPT-4 can write high-quality articles that iThenticate does not flag as plagiarized, that editors may fail to identify, and that AI detection tools identify only to a limited extent. Editors showed a positive bias toward human-written articles.
Affiliation(s)
- Burak Berksu Ozkara
- From the Department of Neuroradiology (B.B.O., M.W.), The University of Texas MD Anderson Center, Houston, Texas
- Alexandre Boutet
- Joint Department of Medical Imaging (A.B.), University of Toronto, Toronto, Ontario, Canada
- Bryan A Comstock
- Department of Biostatistics (B.A.C.), University of Washington, Seattle, Washington
- Johan Van Goethem
- Department of Radiology (J.V.G.), Antwerp University Hospital, Antwerp, Belgium
- Thierry A G M Huisman
- Department of Radiology (T.A.G.M.H.), Texas Children's Hospital and Baylor College of Medicine, Houston, Texas
- Jeffrey S Ross
- Department of Radiology (J.S.R.), Mayo Clinic Arizona, Phoenix, Arizona
- Luca Saba
- Department of Radiology (L.S.), University of Cagliari, Cagliari, Italy
- Lubdha M Shah
- Department of Radiology (L.M.S.), University of Utah, Salt Lake City, Utah
- Max Wintermark
- From the Department of Neuroradiology (B.B.O., M.W.), The University of Texas MD Anderson Center, Houston, Texas
- Mauricio Castillo
- Department of Radiology (M.C.), University of North Carolina School of Medicine, Chapel Hill, North Carolina
13
Ekmekçi PE. Reflections on the “Ethics Guideline for using Generative Artificial Intelligence in Scientific Research and Publication Process of Higher Education Institutions”. Balkan Med J 2025; 42:174-175. [PMID: 39501521] [PMCID: PMC11881520] [DOI: 10.4274/balkanmedj.galenos.2024.2024-6-72]
Affiliation(s)
- Perihan Elif Ekmekçi
- Department of History of Medicine and Ethics, TOBB ETÜ Faculty of Medicine, Ankara, Türkiye
14
Yin S, Huang S, Xue P, Xu Z, Lian Z, Ye C, Ma S, Liu M, Hu Y, Lu P, Li C. Generative artificial intelligence (GAI) usage guidelines for scholarly publishing: a cross-sectional study of medical journals. BMC Med 2025; 23:77. [PMID: 39934830] [PMCID: PMC11816781] [DOI: 10.1186/s12916-025-03899-1]
Abstract
BACKGROUND Generative artificial intelligence (GAI) has developed rapidly and been increasingly used in scholarly publishing, so it is urgent to examine guidelines for its usage. This cross-sectional study aims to examine the coverage and type of recommendations of GAI usage guidelines among medical journals and how these factors relate to journal characteristics. METHODS From the SCImago Journal Rank (SJR) list for medicine in 2022, we generated two groups of journals: top SJR ranked journals (N = 200) and random sample of non-top SJR ranked journals (N = 140). For each group, we examined the coverage of author and reviewer guidelines across four categories: no guidelines, external guidelines only, own guidelines only, and own and external guidelines. We then calculated the number of recommendations by counting the number of usage recommendations for author and reviewer guidelines separately. Regression models examined the relationship of journal characteristics with the coverage and type of recommendations of GAI usage guidelines. RESULTS A higher proportion of top SJR ranked journals provided author guidelines compared to the random sample of non-top SJR ranked journals (95.0% vs. 86.7%, P < 0.01). The two groups of journals had the same median of 5 on a scale of 0 to 7 for author guidelines and a median of 1 on a scale of 0 to 2 for reviewer guidelines. However, both groups had lower percentages of journals providing recommendations for data analysis and interpretation, with the random sample of non-top SJR ranked journals having a significantly lower percentage (32.5% vs. 16.7%, P < 0.05). A higher SJR score was positively associated with providing GAI usage guidelines for both authors (all P < 0.01) and reviewers (all P < 0.01) among the random sample of non-top SJR ranked journals. 
CONCLUSIONS Although most medical journals provided their own GAI usage guidelines or referenced external guidelines, some recommendations remained unspecified (e.g., whether AI can be used for data analysis and interpretation). Additionally, journals with lower SJR scores were less likely to provide guidelines, indicating a potential gap that warrants attention. Collaborative efforts are needed to develop specific recommendations that better guide authors and reviewers.
Affiliation(s)
- Shuhui Yin
- Applied Linguistics & Technology, Department of English, Iowa State University, Ames, IA, USA
- Simu Huang
- Center for Data Science, Zhejiang University, Hangzhou, Zhejiang, China
- Peng Xue
- Institute of Chinese Medical Sciences, University of Macau, Zhuhai, Macao SAR, China
- Centre for Pharmaceutical Regulatory Sciences, University of Macau, Zhuhai, Macao SAR, China
- Faculty of Health Sciences, University of Macau, Zhuhai, Macao SAR, China
- Zhuoran Xu
- Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Zi Lian
- Center for Health Equity & Urban Science Education, Teachers College, Columbia University, New York, NY, USA
- Chenfei Ye
- International Research Institute for Artificial Intelligence, Harbin Institute of Technology, Shenzhen, Guangdong, China
- Siyuan Ma
- Department of Communication, University of Macau, Zhuhai, Macao SAR, China
- Mingxuan Liu
- Department of Communication, University of Macau, Zhuhai, Macao SAR, China
- Yuanjia Hu
- Institute of Chinese Medical Sciences, University of Macau, Zhuhai, Macao SAR, China
- Centre for Pharmaceutical Regulatory Sciences, University of Macau, Zhuhai, Macao SAR, China
- Faculty of Health Sciences, University of Macau, Zhuhai, Macao SAR, China
- Peiyi Lu
- Department of Social Work and Social Administration, University of Hong Kong, Hong Kong SAR, China
- Chihua Li
- Institute of Chinese Medical Sciences, University of Macau, Zhuhai, Macao SAR, China
- Centre for Pharmaceutical Regulatory Sciences, University of Macau, Zhuhai, Macao SAR, China
- Faculty of Health Sciences, University of Macau, Zhuhai, Macao SAR, China
- Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI, USA
15
Khojasteh L, Kafipour R, Pakdel F, Mukundan J. Empowering medical students with AI writing co-pilots: design and validation of AI self-assessment toolkit. BMC Med Educ 2025; 25:159. [PMID: 39891148] [PMCID: PMC11786331] [DOI: 10.1186/s12909-025-06753-3]
Abstract
BACKGROUND AND OBJECTIVES Assessing and improving academic writing skills is a crucial component of higher education. To support students in this endeavor, a comprehensive self-assessment toolkit was developed to provide personalized feedback and guide their writing improvement. The current study aimed to rigorously evaluate the validity and reliability of this academic writing self-assessment toolkit. METHODS The development and validation of the academic writing self-assessment toolkit involved several key steps. First, a thorough review of the literature was conducted to identify the essential criteria for authentic assessment. Next, an analysis of medical students' reflection papers was undertaken to gain insights into their experiences using AI-powered tools for writing feedback. Based on these initial steps, a preliminary version of the self-assessment toolkit was devised. An expert focus group discussion was then convened to refine the questions and content of the toolkit. To assess content validity, the toolkit was evaluated by a panel of 22 medical student participants. They were asked to review each item and provide feedback on the relevance and comprehensiveness of the toolkit for evaluating academic writing skills. Face validity was also examined, with the students assessing the clarity, wording, and appropriateness of the toolkit items. RESULTS The content validity evaluation revealed that 95% of the toolkit items were rated as highly relevant, and 88% were deemed comprehensive in assessing key aspects of academic writing. Minor wording changes were suggested by the students to enhance clarity and interpretability. The face validity assessment found that 92% of the items were rated as unambiguous, with 90% considered appropriate and relevant for self-assessment. Feedback from the students led to the refinement of a few items to improve their clarity in the context of the Persian language. 
The robust reliability testing demonstrated the consistency and stability of the academic writing self-assessment toolkit in measuring students' writing skills over time. CONCLUSION The comprehensive evaluation process has established the academic writing self-assessment toolkit as a robust and credible instrument for supporting students' writing improvement. The toolkit's strong psychometric properties and user-centered design make it a valuable resource for enhancing academic writing skills in higher education.
Affiliation(s)
- Laleh Khojasteh
- Department of English Language, School of Paramedical Sciences, Shiraz University of Medical Sciences, Shiraz, Iran
- Reza Kafipour
- Department of English Language, School of Paramedical Sciences, Shiraz University of Medical Sciences, Shiraz, Iran
- Farhad Pakdel
- Department of English Language, School of Paramedical Sciences, Shiraz University of Medical Sciences, Shiraz, Iran
16
Nabata KJ, AlShehri Y, Mashat A, Wiseman SM. Evaluating human ability to distinguish between ChatGPT-generated and original scientific abstracts. Updates Surg 2025. [PMID: 39853655] [DOI: 10.1007/s13304-025-02106-3]
Abstract
This study aimed to analyze the accuracy of human reviewers in identifying scientific abstracts generated by ChatGPT compared to original abstracts. Participants completed an online survey presenting two research abstracts: one generated by ChatGPT and one original abstract. They had to identify which abstract was generated by AI and provide feedback on their preference and perceptions of AI technology in academic writing. This observational cross-sectional study involved surgical trainees and faculty at the University of British Columbia. The survey was distributed to all surgeons and trainees affiliated with the University of British Columbia, spanning general surgery, orthopedic surgery, thoracic surgery, plastic surgery, cardiovascular surgery, vascular surgery, neurosurgery, urology, otolaryngology, pediatric surgery, and obstetrics and gynecology. A total of 41 participants completed the survey, comprising 10 (23.3%) surgeons. Eighteen (40.0%) participants correctly identified the original abstract. Twenty-six (63.4%) participants preferred the ChatGPT abstract (p = 0.0001). On multivariate analysis, preferring the original abstract was associated with correct identification of the original abstract [OR 7.46, 95% CI (1.78, 31.4), p = 0.006]. Results suggest that human reviewers cannot accurately distinguish between human and AI-generated abstracts, and overall, there was a trend toward a preference for AI-generated abstracts. The findings contributed to understanding the implications of AI in manuscript production, including its benefits and ethical considerations.
Affiliation(s)
- Kylie J Nabata
- Department of Surgery, St. Paul's Hospital, 1081 Burrard St., Vancouver, BC, V6Z 1Y6, Canada
- University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Yasir AlShehri
- Department of Orthopaedic Surgery, Faculty of Medicine, The University of British Columbia, 2775 Laurel St., Vancouver, BC, V5Z 1M9, Canada
- Abdullah Mashat
- Department of Surgery, St. Paul's Hospital, 1081 Burrard St., Vancouver, BC, V6Z 1Y6, Canada
- University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Sam M Wiseman
- Department of Surgery, St. Paul's Hospital, 1081 Burrard St., Vancouver, BC, V6Z 1Y6, Canada
- University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
17
Barbosa-Silva J, Driusso P, Ferreira EA, de Abreu RM. Exploring the Efficacy of Artificial Intelligence: A Comprehensive Analysis of CHAT-GPT's Accuracy and Completeness in Addressing Urinary Incontinence Queries. Neurourol Urodyn 2025; 44:153-164. [PMID: 39390731] [DOI: 10.1002/nau.25603]
Abstract
BACKGROUND Artificial intelligence models are increasingly gaining popularity among patients and healthcare professionals. While it is impossible to restrict patients' access to different sources of information on the Internet, healthcare professionals need to be aware of the content quality available across different platforms. OBJECTIVE To investigate the accuracy and completeness of Chat Generative Pretrained Transformer (ChatGPT) in addressing frequently asked questions related to the management and treatment of female urinary incontinence (UI), compared to recommendations from guidelines. METHODS This is a cross-sectional study. Two researchers developed 14 frequently asked questions related to UI. Then, they were inserted into the ChatGPT platform on September 16, 2023. The accuracy (scores from 1 to 5) and completeness (scores from 1 to 3) of ChatGPT's answers were assessed individually by two experienced researchers in the Women's Health field, following the recommendations proposed by the guidelines for UI. RESULTS Most of the answers were classified as "more correct than incorrect" (n = 6), followed by "more incorrect than correct" (n = 3), "approximately equal correct and incorrect" (n = 2), "nearly all correct" (n = 2), and "correct" (n = 1). Regarding appropriateness, most of the answers were classified as adequate, as they provided the minimum information expected to be classified as correct. CONCLUSION These results showed an inconsistency in the accuracy of answers generated by ChatGPT compared with scientific guidelines. Almost none of the answers provided the complete content expected or reported in previous guidelines, which highlights for healthcare professionals and the scientific community a concern about using artificial intelligence in patient counseling.
Affiliation(s)
- Jordana Barbosa-Silva
- Women's Health Research Laboratory, Physical Therapy Department, Federal University of São Carlos, São Carlos, Brazil
- Patricia Driusso
- Women's Health Research Laboratory, Physical Therapy Department, Federal University of São Carlos, São Carlos, Brazil
- Elizabeth A Ferreira
- Department of Obstetrics and Gynecology, FMUSP School of Medicine, University of São Paulo, São Paulo, Brazil
- Department of Physiotherapy, Speech Therapy and Occupational Therapy, School of Medicine, University of São Paulo, São Paulo, Brazil
- Raphael M de Abreu
- Department of Physiotherapy, LUNEX University, International University of Health, Exercise & Sports S.A., Differdange, Luxembourg
- LUNEX ASBL Luxembourg Health & Sport Sciences Research Institute, Differdange, Luxembourg
18
Ahmed A, Fatani D, Vargas JM, Almutlak M, Bin Helayel H, Fairaq R, Alabdulhadi H. Physicians' Perspectives on ChatGPT in Ophthalmology: Insights on Artificial Intelligence (AI) Integration in Clinical Practice. Cureus 2025; 17:e78069. [PMID: 40013176] [PMCID: PMC11864167] [DOI: 10.7759/cureus.78069]
Abstract
To obtain detailed data on the acceptance of an artificial intelligence chatbot (ChatGPT; OpenAI, San Francisco, CA, USA) in ophthalmology among physicians, a survey explored physician responses regarding the use of ChatGPT in ophthalmology. The survey included questions about the applications of ChatGPT in ophthalmology, future concerns such as job replacement or automation, research, medical education, patient education, ethical concerns, and implementation in practice. One hundred ninety-nine ophthalmic surgeons participated in this study. Approximately two-thirds of the participants had 15 years or more of experience in ophthalmology. One hundred sixteen reported having used ChatGPT. We found no difference in age, gender, or level of experience between those who had or had not used ChatGPT. ChatGPT users tended to consider ChatGPT and artificial intelligence (AI) useful in ophthalmology (P=0.001). Both users and non-users think that AI is useful for identifying early signs of eye disease, providing decision support in treatment planning, monitoring patient progress, answering patient questions, and scheduling appointments. Both users and non-users believe there are some issues related to the use of AI in health care, such as liability issues, privacy concerns, accuracy of diagnosis, trust in the chatbot, ethical issues, and information bias. The use of ChatGPT and other forms of AI is increasingly becoming accepted among ophthalmologists. AI is seen as a helpful tool for improving patient education, decision support, and medical services, but there are also concerns regarding privacy and job displacement, which warrant human oversight.
Affiliation(s)
- Anwar Ahmed
- Research, King Khaled Eye Specialist Hospital, Riyadh, SAU
- Dalal Fatani
- Oculoplastic and Orbit, King Khaled Eye Specialist Hospital, Riyadh, SAU
- Jose M Vargas
- Ophthalmology, King Abdullah Bin Abdulaziz University Hospital, Riyadh, SAU
- Mohammed Almutlak
- Anterior Segment Division, King Khaled Eye Specialist Hospital, Riyadh, SAU
- Halah Bin Helayel
- Anterior Segment Division, King Khaled Eye Specialist Hospital, Riyadh, SAU
- Rafah Fairaq
- Anterior Segment Division, King Khaled Eye Specialist Hospital, Riyadh, SAU
- Halla Alabdulhadi
- Anterior Segment Division, King Khaled Eye Specialist Hospital, Riyadh, SAU
19
Porto JR, Morgan KA, Hecht CJ, Burkhart RJ, Liu RW. Quantifying the Scope of Artificial Intelligence-Assisted Writing in Orthopaedic Medical Literature: An Analysis of Prevalence and Validation of AI-Detection Software. J Am Acad Orthop Surg 2025; 33:42-50. [PMID: 39602700] [DOI: 10.5435/jaaos-d-24-00084]
Abstract
INTRODUCTION The popularization of generative artificial intelligence (AI), including Chat Generative Pre-trained Transformer (ChatGPT), has raised concerns for the integrity of academic literature. This study asked the following questions: (1) Has the popularization of publicly available generative AI, such as ChatGPT, increased the prevalence of AI-generated orthopaedic literature? (2) Can AI detectors accurately identify ChatGPT-generated text? (3) Are there associations between article characteristics and the likelihood that an article was AI generated? METHODS PubMed was searched across six major orthopaedic journals to identify articles received for publication after January 1, 2023. Two hundred forty articles were randomly selected and entered into three popular AI detectors. Twenty articles published by each journal before the release of ChatGPT were randomly selected as negative control articles. Thirty-six positive control articles (6 per journal) were created by altering 25%, 50%, and 100% of text from negative control articles using ChatGPT and were then used to validate each detector. The mean percentage of text detected as written by AI per detector was compared between pre-ChatGPT and post-ChatGPT release articles using an independent t-test. Multivariate regression analysis was conducted using percentage of AI-generated text per journal, article type (ie, cohort, clinical trial, review), and month of submission. RESULTS One AI detector consistently and accurately identified AI-generated text in positive control articles, whereas two others showed poor sensitivity and specificity. The most accurate detector showed a modest increase in the percentage of AI-detected text for articles received after the release of ChatGPT (+1.8%, P = 0.01). Regression analysis showed no consistent associations between likelihood of AI-generated text and journal, article type, or month of submission.
CONCLUSIONS As this study found an early, albeit modest, effect of generative AI on the orthopaedic literature, proper oversight will play a critical role in maintaining research integrity and accuracy. AI detectors may play a critical role in regulatory efforts, although they will require further development and standardization in the interpretation of their results.
Affiliation(s)
- Joshua R Porto
- From the Department of Orthopaedic Surgery, University Hospitals of Cleveland, Case Western Reserve University, Cleveland, OH (Porto, Morgan, Hecht, Burkhart, and Liu), and the Case Western Reserve University School of Medicine, Cleveland, OH (Porto, Morgan, and Hecht)
20
Gleasner RM, Sood A. Special Issues: The roles of special issues in scholarly communication in a changing publishing landscape. Learned Publishing 2025; 38:e1635. [PMID: 39734329] [PMCID: PMC11671123] [DOI: 10.1002/leap.1635]
Abstract
This paper aims to enhance the understanding of the role of special issues in the evolving landscape of academic publishing, offering insights for publishers, editors, guest editors, and researchers, including how new technologies influence transparency in publishing processes, open access models, and metrics for success. Based upon original analysis, the paper also discusses the importance of special issues and opportunities to support diversity, equity, and inclusivity in special issue publishing programs. The goal is to contribute to the discussion of maintaining research integrity through special issues, acknowledging their significance in scholarly communication, while offering suggestions for the future.
Affiliation(s)
- Robyn M Gleasner
- University of New Mexico Health Sciences Library and Informatics Center
21
Molligan J, Pérez-López E. Artificial intelligence in academia: opportunities, challenges, and ethical considerations. Biochem Cell Biol 2025; 103:1-3. [PMID: 39611424] [DOI: 10.1139/bcb-2024-0216]
Affiliation(s)
- Joshua Molligan
- Département de phytologie, Faculté des sciences de l'agriculture et de l'alimentation, Université Laval, Québec City, QC, Canada
- Centre de recherche et d'innovation sur les végétaux, Université Laval, Québec City, QC, Canada
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec City, QC, Canada
- L'Institut EDS, Université Laval, Québec City, QC, Canada
- Edel Pérez-López
- Département de phytologie, Faculté des sciences de l'agriculture et de l'alimentation, Université Laval, Québec City, QC, Canada
- Centre de recherche et d'innovation sur les végétaux, Université Laval, Québec City, QC, Canada
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec City, QC, Canada
- L'Institut EDS, Université Laval, Québec City, QC, Canada
22
Kabir A, Shah S, Haddad A, Raper DMS. Introducing Our Custom GPT: An Example of the Potential Impact of Personalized GPT Builders on Scientific Writing. World Neurosurg 2025; 193:461-468. [PMID: 39442688] [DOI: 10.1016/j.wneu.2024.10.041]
Abstract
BACKGROUND The rapid progression of artificial intelligence (AI) and large language models (LLMs), such as ChatGPT, has increased their utility and popularity across many fields. Discourse about AI's potential role in different aspects of scientific literature, such as writing, data analysis, and literature review, is growing as these programs continue to improve. This study uses a recently released ChatGPT feature that allows users to create customized GPTs, highlighting the potential of custom GPTs tailored to prepare and write research manuscripts. METHODS We developed two GPTs, Neurosurgical Research Paper Writer and Medi Research Assistant, through iterative refinement with ChatGPT 4.0's GPT Builder. This process involved providing specific, thorough instructions, along with repeated testing and feedback-driven adjustments, to finalize versions of the models that fit our needs. RESULTS The GPT models we created efficiently and consistently produced accurate outputs from input prompts according to their specific configurations. They effectively analyzed the literature they retrieved and synthesized information reliably, in prose comparable to manuscripts authored by scientific professionals. CONCLUSIONS While the ability of modern AI to generate scientific manuscripts has progressed significantly, the persistence of fallacies and miscalculations suggests that GPTs require extensive calibration before achieving greater reliability and consistency. Nevertheless, the prospective horizon of AI-driven research holds promise for streamlining the publication workflow and increasing access to novel research.
Affiliation(s)
- Aymen Kabir
- Department of Neurological Surgery, University of California, San Francisco, California, USA
- Suraj Shah
- University of California, Berkeley, California, USA
- Alexander Haddad
- Department of Neurological Surgery, University of California, San Francisco, California, USA
- Daniel M S Raper
- Department of Neurological Surgery, University of California, San Francisco, California, USA
23
Dumkrieger GM, Chiang CC, Zhang P, Minen MT, Cohen F, Hranilovich JA. Artificial intelligence terminology, methodology, and critical appraisal: A primer for headache clinicians and researchers. Headache 2025; 65:180-190. [PMID: 39658951] [PMCID: PMC11840968] [DOI: 10.1111/head.14880]
Abstract
OBJECTIVE To provide an overview of artificial intelligence (AI) and machine learning (ML) methodology and appraisal, tailored to clinicians and researchers in the headache field, to facilitate interdisciplinary communication and research. BACKGROUND The application of AI to the study of headache and other healthcare challenges is growing rapidly. It is critical that these findings be accurately interpreted by headache specialists, which can be difficult for non-AI specialists. METHODS This paper is a narrative review of the fundamentals required to understand ML/AI headache research. With guidance from key leaders in headache medicine and AI, important references were reviewed and cited to provide a comprehensive overview of the terminology, methodology, applications, pitfalls, and biases of AI. RESULTS We review how AI models are created, common model types, methods of evaluation, and examples of their application to headache medicine. We also highlight pitfalls relevant to consuming AI research, and discuss ethical issues of bias, privacy, and abuse raised by AI. Additionally, we highlight recent related research from across headache-related applications. CONCLUSION Many promising current and future applications of ML and AI exist in headache medicine. Understanding the fundamentals of AI will allow readers to understand and critically appraise AI-related research findings in their proper context. This paper will increase readers' comfort in consuming AI/ML-based research and will prepare them to think critically about related research developments.
Affiliation(s)
- Pengfei Zhang
- Rutgers Robert Wood Johnson Medical School, New Brunswick, New Jersey, USA
- Mia T Minen
- Department of Neurology, NYU Langone Health, New York, New York, USA
- Department of Population Health, NYU Langone Health, New York, New York, USA
- Fred Cohen
- Department of Neurology, Mount Sinai Hospital, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Department of Medicine, Mount Sinai Hospital, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Jennifer A Hranilovich
- Division of Child Neurology, Department of Pediatrics, University of Colorado School of Medicine, Aurora, Colorado, USA
24
Beg MJ. Responsible AI Integration in Mental Health Research: Issues, Guidelines, and Best Practices. Indian J Psychol Med 2025; 47:5-8. [PMID: 39650770] [PMCID: PMC11624515] [DOI: 10.1177/02537176241302898]
Affiliation(s)
- Mirza Jahanzeb Beg
- Dept. of Psychology, Kumaraguru College of Liberal Arts and Science, Coimbatore, Tamil Nadu, India
25
Wong J, Kriegler C, Shrivastava A, Duimering A, Le C. Utility of Chatbot Literature Search in Radiation Oncology. J Cancer Educ 2024. [PMID: 39673022] [DOI: 10.1007/s13187-024-02547-1]
Abstract
Artificial intelligence and natural language processing tools have shown promise in oncology by assisting with medical literature retrieval and providing patient support. The potential for these technologies to generate inaccurate yet seemingly correct information poses significant challenges. This study evaluates the effectiveness, benefits, and limitations of ChatGPT for clinical use in conducting literature reviews of radiation oncology treatments. This cross-sectional study used ChatGPT version 3.5 to generate literature searches on radiotherapy options for seven tumor sites, with prompts issued five times per site to generate up to 50 publications per tumor type. The publications were verified using the Scopus database and categorized as correct, irrelevant, or non-existent. Statistical analysis with one-way ANOVA compared the impact factors and citation counts across different tumor sites. Among the 350 publications generated, there were 44 correct, 298 non-existent, and 8 irrelevant papers. The average publication year of all generated papers was 2011, compared to 2009 for the correct papers. The average impact factor of all generated papers was 38.8, compared to 113.8 for the correct papers. There were significant differences in the publication year, impact factor, and citation counts between tumor sites for both correct and non-existent papers. Our study highlights both the potential utility and significant limitations of using AI, specifically ChatGPT 3.5, in radiation oncology literature reviews. The findings emphasize the need for verification of AI outputs, development of standardized quality assurance protocols, and continued research into AI biases to ensure reliable integration into clinical practice.
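The one-way ANOVA used above to compare impact factors and citation counts across tumor sites reduces to a ratio of between-group to within-group variance. A minimal pure-Python sketch of that computation, using made-up impact-factor values for illustration (the study's actual data are not reproduced here):

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F-statistic for a list of sample groups."""
    all_vals = [x for g in groups for x in g]
    grand_mean = sum(all_vals) / len(all_vals)
    k = len(groups)      # number of groups (e.g., tumor sites)
    n = len(all_vals)    # total number of observations
    # Between-group sum of squares: group size times squared deviation
    # of each group mean from the grand mean
    ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: deviations from each group's own mean
    ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return (ssb / (k - 1)) / (ssw / (n - k))

# Hypothetical impact factors for three tumor sites (illustration only)
sites = [[1.0, 2.0, 3.0], [11.0, 12.0, 13.0], [21.0, 22.0, 23.0]]
print(one_way_anova_f(sites))  # a large F indicates the group means differ
```

The F-statistic would then be compared against an F-distribution with (k-1, n-k) degrees of freedom to obtain the p-values the study reports.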
Affiliation(s)
- Justina Wong
- Faculty of Medicine and Dentistry, University of Alberta, Edmonton, AB, Canada
- Conley Kriegler
- Division of Radiation Oncology, Department of Oncology, University of Alberta, Cross Cancer Institute, 11560 University Ave, Edmonton, AB, T6G 1Z2, Canada
- Ananya Shrivastava
- Faculty of Medicine and Dentistry, University of Alberta, Edmonton, AB, Canada
- Adele Duimering
- Division of Radiation Oncology, Department of Oncology, University of Alberta, Cross Cancer Institute, 11560 University Ave, Edmonton, AB, T6G 1Z2, Canada
- Connie Le
- Division of Radiation Oncology, Department of Oncology, University of Alberta, Cross Cancer Institute, 11560 University Ave, Edmonton, AB, T6G 1Z2, Canada
26
Ahn S. Large language model usage guidelines in Korean medical journals: a survey using human-artificial intelligence collaboration. J Yeungnam Med Sci 2024; 42:14. [PMID: 39659196] [PMCID: PMC11812075] [DOI: 10.12701/jyms.2024.00794]
Abstract
BACKGROUND Large language models (LLMs), the most recent advancements in artificial intelligence (AI), have profoundly affected academic publishing and raised important ethical and practical concerns. This study examined the prevalence and content of AI guidelines in Korean medical journals to assess the current landscape and inform future policy implementation. METHODS The top 100 Korean medical journals, as determined by Hirsch index, were surveyed. Author guidelines were collected and screened by a human researcher and an AI chatbot to identify AI-related content. The key components of LLM policies were extracted and compared across journals. Journal characteristics associated with the adoption of AI guidelines were also analyzed. RESULTS Only 18% of the surveyed journals had LLM guidelines, much lower than previously reported for international journals. However, adoption rates increased over time, reaching 57.1% in the first quarter of 2024. High-impact journals were more likely to have AI guidelines. All journals with LLM guidelines required authors to declare LLM tool use, and 94.4% prohibited AI authorship. The key policy components included emphasizing human responsibility (72.2%), discouraging AI-generated content (44.4%), and exempting basic AI tools (38.9%). CONCLUSION While the adoption of LLM guidelines among Korean medical journals lags the global trend, implementation has clearly increased over time. The key components of these guidelines align with international standards, but greater standardization and collaboration are needed to ensure the responsible and ethical use of LLMs in medical research and writing.
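The screening step described above — flagging author guidelines that contain AI-related content — can be approximated with a simple keyword scan. A minimal sketch; the keyword list is an assumption for illustration, not the study's actual screening protocol:

```python
# Hypothetical AI-related terms; the study's actual screening criteria may differ.
AI_KEYWORDS = ["large language model", "llm", "chatgpt", "generative ai",
               "artificial intelligence"]

def find_ai_terms(guideline_text):
    """Return the AI-related keywords found in an author-guideline text."""
    lowered = guideline_text.lower()
    return [kw for kw in AI_KEYWORDS if kw in lowered]

text = ("Authors must declare any use of ChatGPT or other large language "
        "models; AI tools may not be listed as authors.")
print(find_ai_terms(text))  # ['large language model', 'chatgpt']
```

A study like this would still need human (or chatbot-assisted) review of each hit, since a keyword match does not by itself distinguish a usage policy from an incidental mention.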
Affiliation(s)
- Sangzin Ahn
- Department of Pharmacology and PharmacoGenomics Research Center, Inje University College of Medicine, Busan, Korea
- Center for Personalized Precision Medicine of Tuberculosis, Inje University College of Medicine, Busan, Korea
27
Hostetter L, Kelm D, Nelson D. Ethics of Writing Personal Statements and Letters of Recommendations with Large Language Models. ATS Sch 2024; 5:486-491. [PMID: 39822218] [PMCID: PMC11734674] [DOI: 10.34197/ats-scholar.2024-0038ps]
Abstract
Large language models are becoming ubiquitous in the editing and generation of written content and are actively being explored for use in medical education. The use of artificial intelligence (AI) engines to generate content in academic spaces is controversial and has been met with swift responses and guidance from academic journals and publishers regarding the appropriate use, or disclosure of use, of AI engines in professional writing. To date, there is no guidance for applicants to graduate medical education programs on using AI engines to generate application content, primarily personal statements and letters of recommendation. In this Perspective, we review perceptions of using AI to generate application content, considerations for the impact of AI on holistic application review, ethical challenges regarding plagiarism, and AI text classifiers. Finally, we include recommendations to the graduate medical education community to provide guidance on the use of AI engines in applications and to maintain the integrity of the application process in graduate medical education.
Affiliation(s)
- Logan Hostetter
- Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, Mayo Clinic, Rochester, Minnesota
- Diana Kelm
- Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, Mayo Clinic, Rochester, Minnesota
- Darlene Nelson
- Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, Mayo Clinic, Rochester, Minnesota
28
Nian PP, Saleet J, Magruder M, Wellington IJ, Choueka J, Houten JK, Saleh A, Razi AE, Ng MK. ChatGPT as a Source of Patient Information for Lumbar Spinal Fusion and Laminectomy: A Comparative Analysis Against Google Web Search. Clin Spine Surg 2024; 37:E394-E403. [PMID: 38409676] [DOI: 10.1097/bsd.0000000000001582]
Abstract
STUDY DESIGN Retrospective observational study. OBJECTIVE To assess the utility of ChatGPT, an artificial intelligence chatbot, in providing patient information on lumbar spinal fusion and lumbar laminectomy, in comparison with the Google search engine. SUMMARY OF BACKGROUND DATA ChatGPT, an artificial intelligence chatbot with seemingly unlimited functionality, may present an alternative to a Google web search for patients seeking information about medical questions. With widespread misinformation and the suboptimal quality of online health information, it is imperative to assess ChatGPT as a resource for this purpose. METHODS The first 10 frequently asked questions (FAQs) related to the search terms "lumbar spinal fusion" and "lumbar laminectomy" were extracted from Google and ChatGPT. Responses to shared questions were compared in terms of length and readability, using the Flesch Reading Ease score and Flesch-Kincaid Grade Level. Numerical FAQs from Google were replicated in ChatGPT. RESULTS Two of 10 (20%) questions for both lumbar spinal fusion and lumbar laminectomy were asked similarly between ChatGPT and Google. Compared with Google, ChatGPT's responses were lengthier (340.0 vs. 159.3 words) and of lower readability (Flesch Reading Ease score: 34.0 vs. 58.2; Flesch-Kincaid Grade Level: 11.6 vs. 8.8). Subjectively, we evaluated these responses to be accurate and appropriately nonspecific. Each response concluded with a recommendation to discuss further with a health care provider. Over half of the numerical questions from Google produced a varying or nonnumerical response in ChatGPT. CONCLUSIONS FAQs and responses regarding lumbar spinal fusion and lumbar laminectomy were highly variable between Google and ChatGPT. While ChatGPT may produce relatively accurate responses to select questions, its role remains that of a supplement or starting point to a consultation with a physician, not a replacement, and its responses should be taken with caution until its functionality can be validated.
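The two readability metrics used in the study above are closed-form functions of word, sentence, and syllable counts. A minimal sketch of the standard formulas; counts are supplied directly here, whereas real use would also need a tokenizer and a syllable counter:

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher scores indicate easier text (roughly 0-100)."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: approximate US school grade of the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# Example: a 100-word text with 5 sentences and 150 syllables
print(flesch_reading_ease(100, 5, 150))   # around 59.6: "fairly difficult"
print(flesch_kincaid_grade(100, 5, 150))  # around grade 9.9
```

Both formulas penalize long sentences and polysyllabic words, which is why ChatGPT's lengthier, more technical answers score as less readable than Google's FAQ snippets.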
Affiliation(s)
- Patrick P Nian
- Departments of Orthopaedic Surgery, SUNY Downstate Health Sciences University, College of Medicine, Brooklyn, NY
- John K Houten
- Department of Neurosurgery, Icahn School of Medicine at Mount Sinai, New York, NY
29
Soulage CO, Van Coppenolle F, Guebre-Egziabher F. The conversational AI "ChatGPT" outperforms medical students on a physiology university examination. Adv Physiol Educ 2024; 48:677-684. [PMID: 38991037] [DOI: 10.1152/advan.00181.2023]
Abstract
Artificial intelligence (AI) has gained massive interest with the public release of the conversational AI "ChatGPT," but it has also become a matter of concern for academia because it can easily be misused. We performed a quantitative evaluation of the performance of ChatGPT on a medical physiology university examination. Forty-one answers were obtained with ChatGPT and compared to the results of 24 students. The results of ChatGPT were significantly better than those of the students: the median (IQR) score was 75% (66-84%) for the AI compared to 56% (43-65%) for the students (P < 0.001). The exam success rate was 100% for ChatGPT, whereas 29% (n = 7) of students failed. ChatGPT could promote plagiarism and intellectual laziness among students and could represent a new and easy way to cheat, especially when evaluations are performed online. Considering that these powerful AI tools are now freely available, scholars should take great care to construct assessments that genuinely evaluate student reflection skills and prevent AI-assisted cheating. NEW & NOTEWORTHY The release of the conversational artificial intelligence (AI) ChatGPT has become a matter of concern for academia because it can easily be misused by students for cheating. We performed a quantitative evaluation of the performance of ChatGPT on a medical physiology university examination and observed that ChatGPT outperforms medical students, obtaining significantly better grades. Scholars should therefore take great care to construct assessments that genuinely evaluate student reflection skills and prevent AI-assisted cheating.
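The median (IQR) summary reported above can be reproduced with Python's statistics module. A minimal sketch using made-up scores for illustration (not the study's data):

```python
import statistics

def summarize(scores):
    """Return (median, Q1, Q3) of a list of scores, using inclusive quartiles."""
    q1, med, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return med, q1, q3

# Hypothetical exam scores (percent), for illustration only
chatgpt_scores = [66, 70, 75, 80, 84]
student_scores = [43, 50, 56, 60, 65]
print(summarize(chatgpt_scores))  # median with interquartile bounds
print(summarize(student_scores))
```

Note that quartile conventions vary; `method="inclusive"` treats the data as the whole population, and other conventions (e.g., the default exclusive method) can yield slightly different IQR bounds on small samples.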
Affiliation(s)
- Christophe O Soulage
- CarMeN, INSERM U1060, INRAe U1397, Université Claude Bernard Lyon 1, Bron, France
- Fitsum Guebre-Egziabher
- CarMeN, INSERM U1060, INRAe U1397, Université Claude Bernard Lyon 1, Bron, France
- Department of Nephrology, Groupement Hospitalier Centre, Hospices Civils de Lyon, Hôpital E. Herriot, Lyon, France
30
Chauhan C, Currie G. The Impact of Generative Artificial Intelligence on Research Integrity in Scholarly Publishing. Am J Pathol 2024; 194:2234-2238. [PMID: 39396568] [DOI: 10.1016/j.ajpath.2024.10.001]
Affiliation(s)
- Chhavi Chauhan
- American Society for Investigative Pathology, Rockville, Maryland
- George Currie
- eLife Sciences Publications, Ltd., Oxford, United Kingdom
31
Patel SJ, Notarianni AP, Martin AK, Tsai A, Pulton DA, Linganna RE, Bhatte S, Montealegre-Gallegos M, Patel B, Waldron NH, Nimma SR, Kothari P, Kiwakyou L, Baskin SM, Feinman JW. The Year in Graduate Medical Education: Selected Highlights from 2023. J Cardiothorac Vasc Anesth 2024; 38:2906-2914. [PMID: 39261208] [DOI: 10.1053/j.jvca.2024.05.003]
Abstract
This special article is the third in an annual series of the Journal of Cardiothoracic and Vascular Anesthesia that highlights significant literature from the world of graduate medical education published over the past year. Major themes addressed in this review include the potential uses and pitfalls of artificial intelligence in graduate medical education, trainee well-being and the rise of unionized house staff, the effect of gender and race/ethnicity on residency application and attrition rates, and the adoption of novel technologies in medical simulation and education. We thank the editorial board for again allowing us to draw attention to some of the more interesting work published in the field of graduate medical education during 2023. We hope that readers find these highlights thought-provoking and informative as we all strive to successfully educate the next generation of anesthesiologists.
Affiliation(s)
- Saumil J Patel
- Department of Anesthesiology and Critical Care, Perelman School of Medicine, Philadelphia, PA
- Andrew P Notarianni
- Cardiothoracic Division, Department of Anesthesiology, Yale University School of Medicine, New Haven, CT
- Archer Kilbourne Martin
- Division of Cardiovascular and Thoracic Anesthesiology, Department of Anesthesiology and Perioperative Medicine, Mayo Clinic, Jacksonville, FL
- Albert Tsai
- Division of Cardiothoracic Anesthesiology, Department of Anesthesiology, Perioperative, and Pain Medicine, Stanford University School of Medicine, Stanford, CA
- Danielle A Pulton
- Department of Anesthesiology, Temple University Hospital/Lewis Katz School of Medicine, Philadelphia, PA
- Regina E Linganna
- Department of Anesthesiology and Critical Care, Perelman School of Medicine, Philadelphia, PA
- Sai Bhatte
- Perelman School of Medicine, Philadelphia, PA
- Mario Montealegre-Gallegos
- Cardiothoracic Division, Department of Anesthesiology, Yale University School of Medicine, New Haven, CT
- Bhoumesh Patel
- Cardiothoracic Division, Department of Anesthesiology, Yale University School of Medicine, New Haven, CT
- Nathan H Waldron
- Division of Cardiovascular and Thoracic Anesthesiology, Department of Anesthesiology and Perioperative Medicine, Mayo Clinic, Jacksonville, FL
- Sindhuja R Nimma
- Division of Cardiovascular and Thoracic Anesthesiology, Department of Anesthesiology and Perioperative Medicine, Mayo Clinic, Jacksonville, FL
- Perin Kothari
- Division of Cardiothoracic Anesthesiology, Department of Anesthesiology, Perioperative, and Pain Medicine, Stanford University School of Medicine, Stanford, CA
- Larissa Kiwakyou
- Division of Cardiothoracic Anesthesiology, Department of Anesthesiology, Perioperative, and Pain Medicine, Stanford University School of Medicine, Stanford, CA
- Sean M Baskin
- Department of Anesthesiology, Temple University Hospital/Lewis Katz School of Medicine, Philadelphia, PA
- Jared W Feinman
- Department of Anesthesiology and Critical Care, Perelman School of Medicine, Philadelphia, PA
32
Simsek O, Manteghinejad A, Vossough A. A Comparative Review of Imaging Journal Policies for Use of AI in Manuscript Generation. Acad Radiol 2024; 31:5232-5236. [PMID: 38772797] [DOI: 10.1016/j.acra.2024.05.006]
Abstract
RATIONALE AND OBJECTIVES Artificial intelligence (AI) technologies are rapidly evolving, offering new advances almost daily, including various tools for manuscript generation and modification. On the other hand, these potentially time- and effort-saving solutions carry risks of bias, factual error, and plagiarism. Some journals have started to update their author guidelines to address AI-generated or AI-assisted manuscripts. The purpose of this paper is to evaluate radiology journals' author guidelines for AI use policies and to compare scientometric data between journals with and without explicit AI use policies. MATERIALS AND METHODS This cross-sectional study included 112 MEDLINE-indexed imaging journals and evaluated their author guidelines between 13 October 2023 and 16 October 2023. Journals were identified based on subject matter and association with a radiological society. Author guidelines and editorial policies were evaluated for policies on the use of AI in manuscript preparation and for specific policies on AI-generated images. We assessed the existence of an AI usage policy among subspecialty imaging journals. Scientometric scores of journals with and without AI use policies were compared using the Wilcoxon signed-rank test. RESULTS Among 112 MEDLINE-indexed radiology journals, 80 were affiliated with an imaging society and 32 were not. Of the 112 imaging journals, 69 (61.6%) had an AI usage policy, and 40 (57.9%) of those 69 mentioned a specific policy about AI-generated figures. CiteScore (4.9 vs 4, p = 0.023), Source Normalized Impact per Paper (1.12 vs 0.83, p = 0.06), Scientific Journal Ranking (0.75 vs 0.54, p = 0.010), and Journal Citation Indicator (0.77 vs 0.62, p = 0.038) were higher in journals with an AI policy. CONCLUSION The majority of imaging journals provide guidelines for AI-generated content, but a substantial number still have no AI usage policy or do not require disclosure for non-human-created manuscripts. Journals with an established AI policy had higher citation and impact scores.
Affiliation(s)
- Onur Simsek
- Division of Neuroradiology, Department of Radiology, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Amirreza Manteghinejad
- Division of Neuroradiology, Department of Radiology, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Arastoo Vossough
- Division of Neuroradiology, Department of Radiology, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA
- Department of Radiology, Children's Hospital of Philadelphia, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
33
Khalifa AA, Ibrahim MA. Artificial intelligence (AI) and ChatGPT involvement in scientific and medical writing, a new concern for researchers. A scoping review. Arab Gulf J Sci Res 2024; 42:1770-1787. [DOI: 10.1108/agjsr-09-2023-0423]
Abstract
Purpose The study aims to evaluate PubMed publications on ChatGPT or artificial intelligence (AI) involvement in scientific or medical writing and to investigate whether ChatGPT or AI was used to create these articles or listed as an author. Design/methodology/approach This scoping review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews (PRISMA-ScR) guidelines. A PubMed database search was performed for articles published between January 1 and November 29, 2023, using appropriate search terms; both authors performed screening and selection independently. Findings From the initial search results of 127 articles, 41 were eligible for final analysis. Articles were published in 34 journals. Editorials were the most common article type, with 15 (36.6%) articles. Authors originated from 27 countries, with authors from the USA contributing the most, at 14 (34.1%) articles. The most discussed topic was AI tools and writing capabilities, in 19 (46.3%) articles. AI or ChatGPT was involved in manuscript preparation in 31 (75.6%) articles. None of the articles listed AI or ChatGPT as an author, and in 19 (46.3%) articles the authors acknowledged utilizing AI or ChatGPT. Practical implications Researchers worldwide are concerned about AI or ChatGPT involvement in scientific research, specifically the writing process. The authors believe that precise and mature regulations will soon be developed by journals, publishers, and editors, paving the way for the best use of these tools. Originality/value This scoping review presents data on the use of AI or ChatGPT in various aspects of scientific research and writing, and discusses the advantages, disadvantages, and implications of their use.
34
Iftikhar H, Anjum S, Bhutta ZA, Najam M, Bashir K. Performance of ChatGPT in emergency medicine residency exams in Qatar: A comparative analysis with resident physicians. Qatar Med J 2024; 2024:61. [PMID: 39552949] [PMCID: PMC11568194] [DOI: 10.5339/qmj.2024.61]
Abstract
Introduction The inclusion of artificial intelligence (AI) in the healthcare sector has transformed medical practice by introducing innovative techniques for medical education, diagnosis, and treatment. In medical education, the potential of AI to enhance learning and assessment methods is increasingly recognized. This study aims to evaluate the performance of OpenAI's Chat Generative Pre-Trained Transformer (ChatGPT) on emergency medicine (EM) residency examinations in Qatar and compare it with the performance of resident physicians. Methods A retrospective descriptive study with a mixed-methods design was conducted in August 2023. EM residents' examination scores were collected and compared with ChatGPT's performance on the same examinations, which consisted of multiple-choice questions (MCQs) set by the same faculty responsible for Qatari Board EM examinations. ChatGPT's performance on these examinations was analyzed and compared with residents across postgraduate years (PGY). Results The study included 238 emergency department residents from PGY1 to PGY4 and compared their performance with ChatGPT. ChatGPT scored consistently higher than all resident groups in every examination category. However, a notable decline in passing rates was observed among senior residents, indicating a potential misalignment between examination performance and practical competencies; another likely contributor is the impact of the COVID-19 pandemic on their learning experience, knowledge acquisition, and consolidation. Conclusion ChatGPT demonstrated significant proficiency in the theoretical knowledge of EM, outperforming resident physicians in examination settings. This finding suggests the potential of AI as a supplementary tool in medical education.
Affiliation(s)
- Haris Iftikhar
- Emergency Medicine, Hamad General Hospital, Doha, Qatar
- Shahzad Anjum
- Emergency Medicine, Hamad General Hospital, Doha, Qatar
- Zain A Bhutta
- Emergency Medicine, Hamad General Hospital, Doha, Qatar
- Mavia Najam
- Department of Medical Education, Hamad Medical Corporation, Doha, Qatar
- Khalid Bashir
- Emergency Medicine, Hamad General Hospital, Doha, Qatar
35
Bhargava M, Bhardwaj P, Dasgupta R. Artificial Intelligence in Biomedical Research and Publications: It is not about Good or Evil but about its Ethical Use. Indian J Community Med 2024; 49:777-779. [PMID: 39668918] [PMCID: PMC11633261] [DOI: 10.4103/ijcm.ijcm_560_24]
Affiliation(s)
- Madhavi Bhargava
- Department of Community Medicine, Yenepoya Medical College, Mangalore, Karnataka, India
- Center for Nutrition Studies, Yenepoya (Deemed to be University), Mangalore, Karnataka, India
- Pankaj Bhardwaj
- School of Public Health (SPH), Community Medicine and Family Medicine, AIIMS Jodhpur, Rajasthan, India
- Rajib Dasgupta
- Centre of Social Medicine and Community Health, JNU, Delhi, India
|
36
|
Mehta H, Bishnoi A, Reddy A, Vinay K. ChatGPT and academic publishing: Potential and perils. Indian J Dermatol Venereol Leprol 2024; 90:849. [PMID: 38594996] [DOI: 10.25259/ijdvl_533_2023]
Affiliation(s)
- Hitaishi Mehta: Department of Dermatology, Venereology and Leprology, Post Graduate Institute of Medical Education and Research, Chandigarh, India
- Anuradha Bishnoi: Department of Dermatology, Venereology and Leprology, Post Graduate Institute of Medical Education and Research, Chandigarh, India
- Ashwini Reddy: Anaesthesia and Intensive Care, Post Graduate Institute of Medical Education and Research, Chandigarh, India
- Keshavamurthy Vinay: Department of Dermatology, Venereology and Leprology, Post Graduate Institute of Medical Education and Research, Chandigarh, India

37.
Kron P, Farid S, Ali S, Lodge P. Artificial Intelligence: A Help or Hindrance to Scientific Writing? Ann Surg 2024; 280:713-718. [PMID: 39087343] [DOI: 10.1097/sla.0000000000006464]
Abstract
We assessed Chat Generative Pretrained Transformer (ChatGPT), a type of artificial intelligence software designed to simulate conversations with human users, in an experiment designed to test its relevance to scientific writing. ChatGPT could become a promising and powerful tool for tasks such as automated draft generation, which may be useful in academic activities to make writing faster and easier. However, the use of this tool in scientific writing raises ethical concerns, and there have therefore been calls for it to be regulated. It may be difficult to recognize whether an abstract or paper was written by a chatbot or a human being, because chatbots use advanced techniques, such as natural language processing and machine learning, to generate text that closely resembles human writing. Detecting the author is a complex task that requires thorough critical reading to reach a conclusion. The aim of this paper is, therefore, to explore the pros and cons of the use of chatbots in scientific writing.
Affiliation(s)
- Philipp Kron: HPB and Transplant Unit, St. James's University Hospital, Leeds Teaching Hospitals NHS Trust, Leeds, United Kingdom; Department for General and Transplantation Surgery, University Hospital Tuebingen, Tuebingen, Germany
- Shahid Farid: HPB and Transplant Unit, St. James's University Hospital, Leeds Teaching Hospitals NHS Trust, Leeds, United Kingdom
- Sharib Ali: Faculty of Engineering and Physical Sciences, School of Computing, University of Leeds, Leeds, United Kingdom
- Peter Lodge: HPB and Transplant Unit, St. James's University Hospital, Leeds Teaching Hospitals NHS Trust, Leeds, United Kingdom

38.
Carnino JM, Chong NYK, Bayly H, Salvati LR, Tiwana HS, Levi JR. AI-generated text in otolaryngology publications: a comparative analysis before and after the release of ChatGPT. Eur Arch Otorhinolaryngol 2024; 281:6141-6146. [PMID: 39014250] [PMCID: PMC11513233] [DOI: 10.1007/s00405-024-08834-3]
Abstract
PURPOSE This study delves into the broader implications of artificial intelligence (AI) text generation technologies, including large language models (LLMs) and chatbots, for the scientific literature of otolaryngology. By observing trends in AI-generated text within published otolaryngology studies, this investigation aims to contextualize the impact of AI-driven tools that are reshaping scientific writing and communication. METHODS Text from 143 original articles published in JAMA Otolaryngology - Head and Neck Surgery was collected, representing periods before and after ChatGPT's release in November 2022. The text from each article's abstract, introduction, methods, results, and discussion was entered into ZeroGPT.com to estimate the percentage of AI-generated content. Statistical analyses, including t-tests and Fligner-Killeen tests, were conducted using R. RESULTS A significant increase was observed in the mean percentage of AI-generated text post-ChatGPT release, especially in the abstract (from 34.36% to 46.53%, p = 0.004), introduction (from 32.43% to 45.08%, p = 0.010), and discussion sections (from 15.73% to 25.03%, p = 0.015). Publications by authors from non-English-speaking countries demonstrated a higher percentage of AI-generated text. CONCLUSION This study found that the advent of ChatGPT has significantly impacted writing practices among researchers publishing in JAMA Otolaryngology - Head and Neck Surgery, raising concerns over the accuracy of AI-created content and potential misinformation risks. This manuscript highlights the evolving dynamics between AI technologies, scientific communication, and publication integrity, emphasizing the urgent need for continued research in this dynamic field. The findings also suggest an increasing reliance on AI tools like ChatGPT, raising questions about their broader implications for scientific publishing.
Affiliation(s)
- Jonathan M Carnino: Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Nicholas Y K Chong: Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Henry Bayly: Boston University School of Public Health, Boston, MA, USA
- Hardeep S Tiwana: Washington State University Elson S. Floyd College of Medicine, Spokane, WA, USA
- Jessica R Levi: Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA; Department of Otolaryngology - Head and Neck Surgery, Boston Medical Center, Boston, MA, USA

39.
Singh S, Kumar R, Maharshi V, Singh PK, Kumari V, Tiwari M, Harsha D. Harnessing Artificial Intelligence for Advancing Medical Manuscript Composition: Applications and Ethical Considerations. Cureus 2024; 16:e71744. [PMID: 39552958] [PMCID: PMC11569240] [DOI: 10.7759/cureus.71744]
Abstract
Scientific medical manuscripts are fundamental to advancing research and enhancing patient care. With the emergence of artificial intelligence (AI), the process of composing such manuscripts has witnessed profound transformations. This review delves into the multifaceted role of AI in medical manuscript composition, analyzing its applications, benefits, drawbacks, and ethical implications. Employing a comprehensive narrative review methodology, we explored databases such as PubMed, Google Scholar, and ScienceDirect. The review charts the evolution of AI in medical writing, from basic word processing to sophisticated neural network-based models like GPT-3 and GPT-4. Various AI-powered tools such as ChatGPT, Google Bard, Elicit, and Consensus AI are examined in terms of their functionalities and contributions to research and medical writing. While AI technologies offer notable advantages in automating content creation and boosting research productivity, concerns persist regarding overreliance, potential homogenization of writing styles, and ethical considerations such as originality and authorship. Because of these concerns, some organizations are restricting the use of AI in peer review processes, medical examinations, and similar settings. It is crucial to strike a balance in integrating AI tools, ensuring human oversight, conducting thorough algorithm audits, addressing financial implications, and upholding academic integrity. The review underscores the transformative potential of AI in medical manuscript composition while emphasizing the ongoing significance of human expertise, creativity, and ethical responsibility in scientific communication. Recommendations are provided for the effective integration of AI tools into medical writing processes, emphasizing collaborative efforts between AI developers, researchers, and journal editors to navigate ethical dilemmas and maximize the benefits of AI-driven advancements in scientific publishing.
Affiliation(s)
- Shruti Singh: Pharmacology, All India Institute of Medical Sciences, Patna, Patna, IND
- Rajesh Kumar: Pharmacology, All India Institute of Medical Sciences, Patna, Patna, IND
- Vikas Maharshi: Pharmacology, All India Institute of Medical Sciences, Patna, Patna, IND
- Prashant K Singh: General Surgery, All India Institute of Medical Sciences, Patna, Patna, IND
- Veena Kumari: Plastic Surgery, All India Institute of Medical Sciences, Patna, Patna, IND
- Meenakshi Tiwari: Lab Medicine, All India Institute of Medical Sciences, Patna, Patna, IND
- Divya Harsha: Pharmacology, All India Institute of Medical Sciences, Patna, Patna, IND

40.
Yeo-Teh NSL, Tang BL. Letter to editor: NLP systems such as ChatGPT cannot be listed as an author because these cannot fulfill widely adopted authorship criteria. Account Res 2024; 31:968-970. [PMID: 36748354] [DOI: 10.1080/08989621.2023.2177160]
Abstract
This letter to the editor suggests adding a technical point to the new editorial policy expounded by Hosseini et al. on the mandatory disclosure of any use of natural language processing (NLP) systems, or generative AI, in writing scholarly publications. Such AI systems should naturally also be forbidden from being named as authors, because they would not have fulfilled prevailing authorship guidelines (such as the widely adopted ICMJE authorship criteria).
Affiliation(s)
- Nicole Shu Ling Yeo-Teh: Research Compliance and Integrity Office, National University of Singapore, Singapore, Singapore
- Bor Luen Tang: Department of Biochemistry, Yong Loo Lin School of Medicine, National University Health System, Singapore, Singapore

41.
Arun G, Perumal V, Urias FPJB, Ler YE, Tan BWT, Vallabhajosyula R, Tan E, Ng O, Ng KB, Mogali SR. ChatGPT versus a customized AI chatbot (Anatbuddy) for anatomy education: A comparative pilot study. Anat Sci Educ 2024; 17:1396-1405. [PMID: 39169464] [DOI: 10.1002/ase.2502]
Abstract
Large language models (LLMs) have the potential to improve education by personalizing learning. However, ChatGPT-generated content has been criticized for sometimes producing false, biased, and/or hallucinatory information. To evaluate AI's ability to return clear and accurate anatomy information, this study generated a custom interactive and intelligent chatbot (Anatbuddy) through the OpenAI Application Programming Interface (API), which enables seamless AI-driven interactions within a secured cloud infrastructure. Anatbuddy was programmed through a Retrieval Augmented Generation (RAG) method to provide context-aware responses to user queries based on a predetermined knowledge base. To compare their outputs, various queries (i.e., prompts) on thoracic anatomy (n = 18) were fed into Anatbuddy and ChatGPT 3.5. A panel comprising three experienced anatomists evaluated both tools' responses for factual accuracy, relevance, completeness, coherence, and fluency on a 5-point Likert scale. These ratings were reviewed by a third party blinded to the study, who revised and finalized scores as needed. Anatbuddy's factual accuracy (mean ± SD = 4.78/5.00 ± 0.43; median = 5.00) was rated significantly higher (U = 84, p = 0.01) than ChatGPT's accuracy (4.11 ± 0.83; median = 4.00). No statistically significant differences were detected between the chatbots for the other variables. Given ChatGPT's current content knowledge limitations, we strongly recommend the anatomy profession develop a custom AI chatbot for anatomy education utilizing a carefully curated knowledge base to ensure accuracy. Further research is needed to determine students' acceptance of custom chatbots for anatomy education and their influence on learning experiences and outcomes.
Affiliation(s)
- Gautham Arun: Lee Kong Chian School of Medicine, Nanyang Technological University Singapore, Singapore, Singapore; Singapore Polytechnic, Singapore, Singapore
- Vivek Perumal: Lee Kong Chian School of Medicine, Nanyang Technological University Singapore, Singapore, Singapore
- Yan En Ler: Singapore Polytechnic, Singapore, Singapore
- Emmanuel Tan: Lee Kong Chian School of Medicine, Nanyang Technological University Singapore, Singapore, Singapore
- Olivia Ng: Lee Kong Chian School of Medicine, Nanyang Technological University Singapore, Singapore, Singapore
- Kian Bee Ng: Lee Kong Chian School of Medicine, Nanyang Technological University Singapore, Singapore, Singapore

42.
Oeding JF, Lu AZ, Mazzucco M, Fu MC, Dines DM, Warren RF, Gulotta LV, Dines JS, Kunze KN. Effectiveness of a large language model for clinical information retrieval regarding shoulder arthroplasty. J Exp Orthop 2024; 11:e70114. [PMID: 39691559] [PMCID: PMC11649951] [DOI: 10.1002/jeo2.70114]
Abstract
Purpose To determine the scope and accuracy of medical information provided by ChatGPT-4 in response to clinical queries concerning total shoulder arthroplasty (TSA), and to compare these results to those of the Google search engine. Methods A patient-replicated query for 'total shoulder replacement' was performed using both Google Web Search (the most frequently used search engine worldwide) and ChatGPT-4. The top 10 frequently asked questions (FAQs), answers, and associated sources were extracted. This search was performed again independently to identify the top 10 FAQs necessitating numerical responses such that the concordance of answers could be compared between Google and ChatGPT-4. The clinical relevance and accuracy of the provided information were graded by two blinded orthopaedic shoulder surgeons. Results Concerning FAQs with numeric responses, 8 out of 10 (80%) had identical answers or substantial overlap between ChatGPT-4 and Google. Accuracy of information was not significantly different (p = 0.32). Google sources included 40% medical practices, 30% academic, 20% single-surgeon practice, and 10% social media, while ChatGPT-4 used 100% academic sources, representing a statistically significant difference (p = 0.001). Only 3 out of 10 (30%) FAQs with open-ended answers were identical between ChatGPT-4 and Google. The clinical relevance of FAQs was not significantly different (p = 0.18). Google sources for open-ended questions included academic (60%), social media (20%), medical practice (10%) and single-surgeon practice (10%), while 100% of sources for ChatGPT-4 were academic, representing a statistically significant difference (p = 0.0025). Conclusion ChatGPT-4 provided trustworthy academic sources for medical information retrieval concerning TSA, while sources used by Google were heterogeneous. Accuracy and clinical relevance of information were not significantly different between ChatGPT-4 and Google. Level of Evidence Level IV cross-sectional.
Affiliation(s)
- Jacob F. Oeding: Department of Orthopaedics, Institute of Clinical Sciences, The Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
- Amy Z. Lu: Weill Cornell Medical College, New York, New York, USA
- Michael C. Fu: Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, New York, USA; Sports Medicine and Shoulder Institute, Hospital for Special Surgery, New York, New York, USA
- David M. Dines: Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, New York, USA; Sports Medicine and Shoulder Institute, Hospital for Special Surgery, New York, New York, USA
- Russell F. Warren: Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, New York, USA; Sports Medicine and Shoulder Institute, Hospital for Special Surgery, New York, New York, USA
- Lawrence V. Gulotta: Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, New York, USA; Sports Medicine and Shoulder Institute, Hospital for Special Surgery, New York, New York, USA
- Joshua S. Dines: Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, New York, USA; Sports Medicine and Shoulder Institute, Hospital for Special Surgery, New York, New York, USA
- Kyle N. Kunze: Department of Orthopaedic Surgery, Hospital for Special Surgery, New York, New York, USA; Sports Medicine and Shoulder Institute, Hospital for Special Surgery, New York, New York, USA

43.
Filetti S, Fenza G, Gallo A. Research design and writing of scholarly articles: new artificial intelligence tools available for researchers. Endocrine 2024; 85:1104-1116. [PMID: 39085566] [DOI: 10.1007/s12020-024-03977-z]

44.
Suleiman A, von Wedel D, Munoz-Acuna R, Redaelli S, Santarisi A, Seibold EL, Ratajczak N, Kato S, Said N, Sundar E, Goodspeed V, Schaefer MS. Assessing ChatGPT's ability to emulate human reviewers in scientific research: A descriptive and qualitative approach. Comput Methods Programs Biomed 2024; 254:108313. [PMID: 38954915] [DOI: 10.1016/j.cmpb.2024.108313]
Abstract
BACKGROUND ChatGPT is an AI platform whose relevance in the peer review of scientific articles is steadily growing. Nonetheless, it has sparked debates over its potential biases and inaccuracies. This study aims to assess ChatGPT's ability to qualitatively emulate human reviewers in scientific research. METHODS We included the first submitted version of the latest twenty original research articles published by July 3, 2023, in a high-profile medical journal. Each article underwent evaluation by a minimum of three human reviewers during the initial review stage. Subsequently, three researchers with medical backgrounds and expertise in manuscript revision independently and qualitatively assessed the agreement between the peer reviews generated by ChatGPT version GPT-4 and the comments provided by human reviewers for these articles. The level of agreement was categorized as complete, partial, none, or contradictory. RESULTS A total of 720 human reviewer comments were assessed. There was good agreement between the three assessors (overall kappa > 0.6). ChatGPT's comments demonstrated complete agreement in terms of quality and substance with 48 (6.7%) human reviewer comments, partially agreed with 92 (12.8%), identifying issues necessitating further elaboration or recommending supplementary steps to address concerns, had no agreement with 565 (78.5%), and contradicted 15 (2.1%). ChatGPT comments on methods had the lowest proportion of complete agreement (13 comments, 3.6%), while general comments on the manuscript displayed the highest proportion of complete agreement (17 comments, 22.1%). CONCLUSION ChatGPT version GPT-4 has a limited ability to emulate human reviewers within the peer review process of scientific research.
Affiliation(s)
- Aiman Suleiman: Department of Anesthesia, Critical Care and Pain Medicine, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Department of Anesthesia, Critical Care and Pain Medicine, Albert Einstein College of Medicine, Montefiore Medical Center, Bronx, NY, USA
- Dario von Wedel: Department of Anesthesia, Critical Care and Pain Medicine, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Ricardo Munoz-Acuna: Department of Anesthesia, Critical Care and Pain Medicine, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Simone Redaelli: Department of Anesthesia, Critical Care and Pain Medicine, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Abeer Santarisi: Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Department of Emergency Medicine, Disaster Medicine Fellowship, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Eva-Lotte Seibold: Department of Anesthesia, Critical Care and Pain Medicine, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Nikolai Ratajczak: Department of Anesthesia, Critical Care and Pain Medicine, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Shinichiro Kato: Department of Anesthesia, Critical Care and Pain Medicine, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Nader Said: Department of Industrial Engineering, Faculty of Engineering Technologies and Sciences, Higher Colleges of Technology, DWC, Dubai, United Arab Emirates
- Eswar Sundar: Department of Anesthesia, Critical Care and Pain Medicine, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Valerie Goodspeed: Department of Anesthesia, Critical Care and Pain Medicine, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Maximilian S Schaefer: Department of Anesthesia, Critical Care and Pain Medicine, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Klinik für Anästhesiologie, Universitätsklinikum Düsseldorf, Düsseldorf, Germany

45.
Ramoni D, Sgura C, Liberale L, Montecucco F, Ioannidis JPA, Carbone F. Artificial intelligence in scientific medical writing: Legitimate and deceptive uses and ethical concerns. Eur J Intern Med 2024; 127:31-35. [PMID: 39048335] [DOI: 10.1016/j.ejim.2024.07.012]
Abstract
The debate surrounding the integration of artificial intelligence (AI) into scientific writing has already attracted significant interest in the medical and life sciences. While AI can undoubtedly expedite the process of manuscript creation and correction, it raises several criticisms. The crossover between AI and the health sciences is relatively recent, but the use of AI tools among physicians and other scientists who work in the life sciences is growing very fast. Within this whirlwind, it is becoming essential to realize where we are heading and what the limits are, including from an ethical perspective. Modern conversational AIs exhibit a context awareness that enables them to understand and remember any conversation beyond any predefined script. Even more impressively, they can learn and adapt as they engage with a growing volume of human language input. They all share neural networks as background mathematical models and differ from old chatbots in their use of a specific network architecture called the transformer model [1]. Some of them were trained on more than 100 terabytes (TB) of text data (e.g., Bloom, LaMDA) or even 500 TB (e.g., Megatron-Turing NLG); the 4.0 version of ChatGPT (GPT-4) was trained on nearly 45 TB, but it stays updated through its internet connection and can integrate with different plugins that enhance its functionality, making it multimodal.
Affiliation(s)
- Davide Ramoni: Department of Internal Medicine, University of Genoa, 6 viale Benedetto XV, 16132 Genoa, Italy
- Cosimo Sgura: Department of Internal Medicine, University of Genoa, 6 viale Benedetto XV, 16132 Genoa, Italy
- Luca Liberale: Department of Internal Medicine, University of Genoa, 6 viale Benedetto XV, 16132 Genoa, Italy; IRCCS Ospedale Policlinico San Martino, Genoa - Italian Cardiovascular Network, Largo Rosanna Benzi 10, 16132 Genoa, Italy
- Fabrizio Montecucco: Department of Internal Medicine, University of Genoa, 6 viale Benedetto XV, 16132 Genoa, Italy; IRCCS Ospedale Policlinico San Martino, Genoa - Italian Cardiovascular Network, Largo Rosanna Benzi 10, 16132 Genoa, Italy
- John P A Ioannidis: Departments of Medicine, of Epidemiology and Population Health, of Biomedical Science, and of Statistics, and Meta-Research Innovation Center at Stanford (METRICS), Stanford University, Stanford, CA 94305, USA
- Federico Carbone: Department of Internal Medicine, University of Genoa, 6 viale Benedetto XV, 16132 Genoa, Italy; IRCCS Ospedale Policlinico San Martino, Genoa - Italian Cardiovascular Network, Largo Rosanna Benzi 10, 16132 Genoa, Italy

46.
Pividori M, Greene CS. A publishing infrastructure for Artificial Intelligence (AI)-assisted academic authoring. J Am Med Inform Assoc 2024; 31:2103-2113. [PMID: 38879443] [PMCID: PMC11339502] [DOI: 10.1093/jamia/ocae139]
Abstract
OBJECTIVE Investigate the use of advanced natural language processing models to streamline the time-consuming process of writing and revising scholarly manuscripts. MATERIALS AND METHODS For this purpose, we integrate large language models into the Manubot publishing ecosystem to suggest revisions for scholarly texts. Our AI-based revision workflow employs a prompt generator that incorporates manuscript metadata into templates, generating section-specific instructions for the language model. The model then generates revised versions of each paragraph for human authors to review. We evaluated this methodology through 5 case studies of existing manuscripts, including the revision of this manuscript. RESULTS Our results indicate that these models, despite some limitations, can grasp complex academic concepts and enhance text quality. All changes to the manuscript are tracked using a version control system, ensuring transparency in distinguishing between human- and machine-generated text. CONCLUSIONS Given the significant time researchers invest in crafting prose, incorporating large language models into the scholarly writing process can significantly improve the type of knowledge work performed by academics. Our approach also enables scholars to concentrate on critical aspects of their work, such as the novelty of their ideas, while automating tedious tasks like adhering to specific writing styles. Although the use of AI-assisted tools in scientific authoring is controversial, our approach, which focuses on revising human-written text and provides change-tracking transparency, can mitigate concerns regarding AI's role in scientific writing.
Affiliation(s)
- Milton Pividori: Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO 80045, United States; Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, United States
- Casey S Greene: Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO 80045, United States; Center for Health AI, Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO 80045, United States

47.
Xu T, Weng H, Liu F, Yang L, Luo Y, Ding Z, Wang Q. Current Status of ChatGPT Use in Medical Education: Potentials, Challenges, and Strategies. J Med Internet Res 2024; 26:e57896. [PMID: 39196640] [PMCID: PMC11391159] [DOI: 10.2196/57896]
Abstract
ChatGPT, a generative pretrained transformer, has garnered global attention and sparked discussions since its introduction on November 30, 2022. However, it has generated controversy within the realms of medical education and scientific research. This paper examines the potential applications, limitations, and strategies for using ChatGPT. ChatGPT offers personalized learning support to medical students through its robust natural language generation capabilities, enabling it to furnish answers. Moreover, it has demonstrated significant use in simulating clinical scenarios, facilitating teaching and learning processes, and revitalizing medical education. Nonetheless, numerous challenges accompany these advancements. In the context of education, it is of paramount importance to prevent excessive reliance on ChatGPT and combat academic plagiarism. Likewise, in the field of medicine, it is vital to guarantee the timeliness, accuracy, and reliability of content generated by ChatGPT. Concurrently, ethical challenges and concerns regarding information security arise. In light of these challenges, this paper proposes targeted strategies for addressing them. First, the risk of overreliance on ChatGPT and academic plagiarism must be mitigated through ideological education, fostering comprehensive competencies, and implementing diverse evaluation criteria. The integration of contemporary pedagogical methodologies in conjunction with the use of ChatGPT serves to enhance the overall quality of medical education. To enhance the professionalism and reliability of the generated content, it is recommended to implement measures to optimize ChatGPT's training data professionally and enhance the transparency of the generation process. This ensures that the generated content is aligned with the most recent standards of medical practice. Moreover, the enhancement of value alignment and the establishment of pertinent legislation or codes of practice address ethical concerns, including those pertaining to algorithmic discrimination, the allocation of medical responsibility, privacy, and security. In conclusion, while ChatGPT presents significant potential in medical education, it also encounters various challenges. Through comprehensive research and the implementation of suitable strategies, it is anticipated that ChatGPT's positive impact on medical education will be harnessed, laying the groundwork for advancing the discipline and fostering the development of high-caliber medical professionals.
Affiliation(s)
- Tianhui Xu
- Clinical Nursing Teaching and Research Section, The Second Xiangya Hospital of Central South University, Changsha, China
- Xiangya School of Nursing, Central South University, Changsha, China
- Huiting Weng
- Clinical Nursing Teaching and Research Section, The Second Xiangya Hospital of Central South University, Changsha, China
- Fang Liu
- Clinical Nursing Teaching and Research Section, The Second Xiangya Hospital of Central South University, Changsha, China
- Li Yang
- Clinical Nursing Teaching and Research Section, The Second Xiangya Hospital of Central South University, Changsha, China
- Yuanyuan Luo
- Xiangya School of Nursing, Central South University, Changsha, China
- Ziwei Ding
- Xiangya School of Nursing, Central South University, Changsha, China
- Qin Wang
- Clinical Nursing Teaching and Research Section, The Second Xiangya Hospital of Central South University, Changsha, China
- Xiangya School of Nursing, Central South University, Changsha, China

48
Kocak Z. Publication Ethics in the Era of Artificial Intelligence. J Korean Med Sci 2024; 39:e249. [PMID: 39189714 PMCID: PMC11347185 DOI: 10.3346/jkms.2024.39.e249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/21/2024] [Accepted: 07/17/2024] [Indexed: 08/28/2024] Open
Abstract
The application of new technologies, such as artificial intelligence (AI), to science affects the way and the methodology in which research is conducted. While the responsible use of AI brings many innovations and benefits to science and humanity, its unethical use poses a serious threat to scientific integrity and the literature. Even in the absence of malicious use, chatbot output itself carries the risk of containing biases, distortions, irrelevancies, misrepresentations, and plagiarism. Therefore, the use of complex AI algorithms raises concerns about bias, transparency, and accountability, requiring the development of new ethical rules to protect scientific integrity. Unfortunately, the development and writing of ethical codes cannot keep pace with the development and implementation of the technology. The main purpose of this narrative review is to inform readers, authors, reviewers, and editors about new approaches to publication ethics in the era of AI. It specifically focuses on how to disclose the use of AI in a manuscript, how to avoid publishing entirely AI-generated text, and current standards for retraction.
Affiliation(s)
- Zafer Kocak
- Department of Radiation Oncology, Trakya University School of Medicine, Edirne, Türkiye.

49
Fatima A, Shafique MA, Alam K, Fadlalla Ahmed TK, Mustafa MS. ChatGPT in medicine: A cross-disciplinary systematic review of ChatGPT's (artificial intelligence) role in research, clinical practice, education, and patient interaction. Medicine (Baltimore) 2024; 103:e39250. [PMID: 39121303 PMCID: PMC11315549 DOI: 10.1097/md.0000000000039250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 02/01/2024] [Accepted: 07/19/2024] [Indexed: 08/11/2024] Open
Abstract
BACKGROUND ChatGPT, a powerful AI language model, has gained increasing prominence in medicine, offering potential applications in healthcare, clinical decision support, patient communication, and medical research. This systematic review aims to comprehensively assess the applications of ChatGPT in healthcare education, research, writing, patient communication, and practice while also delineating potential limitations and areas for improvement. METHODS Our comprehensive database search retrieved relevant papers from PubMed, Medline, and Scopus. After the screening process, 83 studies met the inclusion criteria. This review includes original studies comprising case reports, analytical studies, and editorials with original findings. RESULTS ChatGPT is useful for scientific research and academic writing, assisting with grammar, clarity, and coherence. This helps non-English speakers and improves accessibility by breaking down linguistic barriers. However, its limitations include probable inaccuracy and ethical issues, such as bias and plagiarism. ChatGPT streamlines workflows and offers diagnostic and educational potential in healthcare but exhibits biases and lacks emotional sensitivity. It is useful in patient communication but requires up-to-date data and faces concerns about the accuracy of information and hallucinatory responses. CONCLUSION Given the potential for ChatGPT to transform healthcare education, research, and practice, it is essential to approach its adoption in these areas with caution due to its inherent limitations.
Affiliation(s)
- Afia Fatima
- Department of Medicine, Jinnah Sindh Medical University, Karachi, Pakistan
- Khadija Alam
- Department of Medicine, Liaquat National Medical College, Karachi, Pakistan

50
Abukhadijah HJ, Nashwan AJ. Transforming Hospital Quality Improvement Through Harnessing the Power of Artificial Intelligence. Global Journal on Quality and Safety in Healthcare 2024; 7:132-139. [PMID: 39104802 PMCID: PMC11298043 DOI: 10.36401/jqsh-24-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/23/2024] [Revised: 04/28/2024] [Accepted: 05/01/2024] [Indexed: 08/07/2024]
Abstract
This policy analysis focuses on harnessing the power of artificial intelligence (AI) in hospital quality improvement to transform quality and patient safety. It examines the application of AI at two fundamental levels: (1) diagnostics and treatment and (2) clinical operations. AI applications in diagnostics directly impact patient care and safety. At the same time, AI indirectly influences patient safety at the clinical operations level by streamlining (1) operational efficiency, (2) risk assessment, (3) predictive analytics, (4) quality indicator reporting, and (5) staff training and education. The challenges and future perspectives of AI application in healthcare, encompassing technological, ethical, and other considerations, are also critically analyzed.
Affiliation(s)
- Abdulqadir J. Nashwan
- Nursing & Midwifery Research Department, Hamad Medical Corporation, Doha, Qatar
- Department of Public Health, College of Health Sciences, QU Health, Qatar University, Doha, Qatar