1
Kasakewitch JPG, Lima DL, Balthazar da Silveira CA, Sanha V, Rasador AC, Cavazzola LT, Mayol J, Malcher F. The Role of Artificial Intelligence Large Language Models in Literature Search Assistance to Evaluate Inguinal Hernia Repair Approaches. J Laparoendosc Adv Surg Tech A 2025. [PMID: 40285461] [DOI: 10.1089/lap.2024.0277]
Abstract
Aim: This study assesses the reliability of artificial intelligence (AI) large language models (LLMs) in identifying relevant literature comparing inguinal hernia repair techniques. Material and Methods: We used LLM chatbots (Bing Chat AI, ChatGPT versions 3.5 and 4.0, and Gemini) to find comparative studies and randomized controlled trials on inguinal hernia repair techniques. The results were then compared with existing systematic reviews (SRs) and meta-analyses and checked for the authenticity of the listed articles. Results: LLMs retrieved 22 studies from 2006 to 2023 across eight journals, while the SRs encompassed a total of 42 studies. On external validation, 63.6% of the studies (14 out of 22), including 10 identified through ChatGPT 4.0 and 6 via Bing AI (with an overlap of 2 studies between them), were confirmed to be authentic. Conversely, 36.3% (8 out of 22) were revealed to be fabrications produced by Google Gemini (Bard), with two (25.0%) of these fabrications mistakenly linked to valid DOIs. Four (25.6%) of the 14 real studies were acknowledged in the SRs, representing 18.1% of all LLM-generated studies. The LLMs missed a total of 38 (90.5%) of the studies included in the previous SRs, while 10 real studies found by the LLMs were not included in the previous SRs. Among those 10 studies, 6 were reviews and 1 was published after the SRs, leaving a total of 3 comparative studies missed by the reviews. Conclusions: This study reveals the mixed reliability of AI language models in scientific literature searches, emphasizing the need for cautious application of AI in academia and continuous evaluation of AI tools in scientific investigations.
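The external-validation step described here (checking whether LLM-suggested citations correspond to real, resolvable articles) can be partly automated. The sketch below is illustrative only, not the authors' actual workflow; it assumes the public Crossref REST API and the `requests` library, and the second DOI in the example list is deliberately fake.

```python
import requests

def doi_exists(doi: str) -> bool:
    """Return True if the DOI resolves to a record in Crossref.

    A 404 response is a strong hint that a citation was fabricated,
    although a missing record can also simply mean the DOI is not
    registered with Crossref, so hits and misses should be checked manually.
    """
    url = f"https://api.crossref.org/works/{doi}"
    resp = requests.get(url, timeout=10)
    return resp.status_code == 200

# Hypothetical screening of LLM-suggested DOIs.
candidate_dois = ["10.1089/lap.2024.0277", "10.1000/fake-doi-from-llm"]
for doi in candidate_dois:
    print(doi, "found" if doi_exists(doi) else "not found in Crossref")
```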
Affiliation(s)
- Joao P G Kasakewitch
- Department of Surgery, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, USA
- Diego L Lima
- Department of Surgery, Montefiore Medical Center, The Bronx, New York, USA
- Valberto Sanha
- Department of Surgery, Universidade Federal de Ciências da Saúde de Porto Alegre, Porto Alegre, Brasil
- Julio Mayol
- Hospital Clínico San Carlos, IdISSC, Universidad Complutense de Madrid, Madrid, Spain
- Flavio Malcher
- Division of General Surgery, NYU Langone Health, New York, New York, USA
2
Al-Rawas M, Qader OAJA, Othman NH, Ismail NH, Mamat R, Halim MS, Abdullah JY, Noorani TY. Identification of dental related ChatGPT generated abstracts by senior and young academicians versus artificial intelligence detectors and a similarity detector. Sci Rep 2025; 15:11275. [PMID: 40175423] [PMCID: PMC11965432] [DOI: 10.1038/s41598-025-95387-y]
Abstract
Several researchers have investigated the consequences of using ChatGPT in the education industry. Their findings raised doubts regarding the probable effects that ChatGPT may have on academia. As such, the present study aimed to assess the ability of three methods, namely (1) academicians (senior and young), (2) three AI detectors (GPT-2 output detector, Writefull GPT detector, and GPTZero), and (3) one plagiarism detector, to differentiate between human- and ChatGPT-written abstracts. A total of 160 abstracts were assessed by these three methods. Two senior and two young academicians used a newly developed rubric to assess the type and quality of 80 human-written and 80 ChatGPT-written abstracts. The results were statistically analysed using crosstabulation and chi-square analysis. Bivariate correlation and the accuracy of the methods were assessed. The findings demonstrated that all three methods made different kinds of incorrect judgements. The level of academician experience may play a role in detection ability, with senior academician 1 demonstrating superior accuracy. The GPTZero AI detector and the similarity detector were very good at accurately identifying the abstracts' origin. In terms of abstract type, every variable was positively correlated, except in the case of the similarity detector (p < 0.05). Human-AI collaboration may significantly benefit the identification of abstract origins.
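The abstract reports crosstabulation and chi-square analysis of classification outcomes. As a purely illustrative sketch (the counts below are invented, not the study's data), a 2x2 contingency table of rater verdict versus true abstract origin can be tested as follows, assuming SciPy is available.

```python
from scipy.stats import chi2_contingency

# Hypothetical counts: rows = rater verdict (human / AI),
# columns = true origin (human-written / ChatGPT-written).
table = [[55, 30],
         [25, 50]]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
# A small p-value indicates the verdicts are associated with the true origin,
# i.e., the rater discriminates better than chance.
```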
Affiliation(s)
- Matheel Al-Rawas
- Prosthodontic Unit, School of Dental Sciences, Universiti Sains Malaysia, Health Campus, Kubang Kerian, Kota Bharu, Kelantan, Malaysia
- Hospital Pakar Universiti Sains Malaysia, Kubang Kerian, Kota Bharu, Kelantan, Malaysia
- Nurul Hanim Othman
- Prosthodontic Unit, School of Dental Sciences, Universiti Sains Malaysia, Health Campus, Kubang Kerian, Kota Bharu, Kelantan, Malaysia
- Hospital Pakar Universiti Sains Malaysia, Kubang Kerian, Kota Bharu, Kelantan, Malaysia
- Noor Huda Ismail
- Prosthodontic Unit, School of Dental Sciences, Universiti Sains Malaysia, Health Campus, Kubang Kerian, Kota Bharu, Kelantan, Malaysia
- Hospital Pakar Universiti Sains Malaysia, Kubang Kerian, Kota Bharu, Kelantan, Malaysia
- Rosnani Mamat
- Hospital Pakar Universiti Sains Malaysia, Kubang Kerian, Kota Bharu, Kelantan, Malaysia
- Conservative Dentistry Unit, School of Dental Sciences, Universiti Sains Malaysia, Health Campus, Kubang Kerian, Kota Bharu, Kelantan, Malaysia
- Mohamad Syahrizal Halim
- Hospital Pakar Universiti Sains Malaysia, Kubang Kerian, Kota Bharu, Kelantan, Malaysia
- Conservative Dentistry Unit, School of Dental Sciences, Universiti Sains Malaysia, Health Campus, Kubang Kerian, Kota Bharu, Kelantan, Malaysia
- Johari Yap Abdullah
- Craniofacial Imaging Laboratory, School of Dental Sciences, Universiti Sains Malaysia, Health Campus, 16150 Kubang Kerian, Kota Bharu, Kelantan, Malaysia.
- Dental Research Unit, Center for Transdisciplinary Research (CFTR), Saveetha Dental College, Saveetha Institute of Medical and Technical Sciences (SIMATS), Saveetha University, Chennai, Tamil Nadu, India.
- Tahir Yusuf Noorani
- Hospital Pakar Universiti Sains Malaysia, Kubang Kerian, Kota Bharu, Kelantan, Malaysia.
- Conservative Dentistry Unit, School of Dental Sciences, Universiti Sains Malaysia, Health Campus, Kubang Kerian, Kota Bharu, Kelantan, Malaysia.
- Saveetha Dental College and Hospitals, Saveetha Institute of Medical and Technical Sciences (SIMATS), Saveetha University, Chennai, Tamil Nadu, India.
3
Cardona G, Argiles M, Pérez-Mañá L. Accuracy of a Large Language Model as a new tool for optometry education. Clin Exp Optom 2025; 108:343-346. [PMID: 38044041] [DOI: 10.1080/08164622.2023.2288174]
Abstract
CLINICAL RELEVANCE The unsupervised introduction of certain Artificial Intelligence tools in optometry education may challenge the proper acquisition of accurate clinical knowledge and skills proficiency. BACKGROUND Large Language Models like ChatGPT (Generative Pretrained Transformer) are increasingly being used by researchers and students for work and academic assignments. The authoritative and conversationally correct language provided by these tools may mask their inherent limitations when presented with specific scientific and clinical queries. METHODS Three sets of 10 queries related to contact lenses & anterior eye, low vision and binocular vision & vision therapy were presented to ChatGPT, with instructions to provide five relevant references to support each response. Three experts and 53 undergraduate and post-graduate students graded from 0 to 10 the accuracy of the responses, and the references were evaluated for precision and relevance. Students graded from 0 to 10 the potential usefulness of ChatGPT for their academic coursework. RESULTS Median scores were 7, 8 and 6 (experts) and 8, 9 and 7.5 (students) for the contact lenses & anterior eye, low vision and binocular vision & vision therapy categories, respectively. Responses to more specific queries were awarded lower scores by both experts (ρ = -0.612; P < 0.001) and students (ρ = -0.578; P = 0.001). Of 150 references, 24% were accurate and 19.3% relevant. Students graded the usefulness of ChatGPT with 7.5 (2 to 9), 7 (3 to 9) and 8.5 (3 to 10) for contact lenses & anterior eye, low vision and binocular vision & vision therapy, respectively. CONCLUSION Careful expert appraisal of the responses and, particularly, of the references provided by ChatGPT is required in research and academic settings. As the use of these tools becomes widespread, it is essential to take proactive steps to address their limitations and ensure their responsible use.
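The negative correlations reported between query specificity and response score (ρ = -0.612 and ρ = -0.578) are rank correlations. A minimal sketch of how such a Spearman coefficient can be computed with SciPy is shown below; the specificity ratings and scores are invented for illustration and are not the study's data.

```python
from scipy.stats import spearmanr

# Hypothetical data: specificity of each query (1 = broad, 5 = very specific)
# and the accuracy score (0-10) awarded to the chatbot's response.
specificity = [1, 2, 2, 3, 3, 4, 4, 5, 5, 5]
accuracy    = [9, 8, 9, 7, 6, 6, 5, 4, 5, 3]

rho, p_value = spearmanr(specificity, accuracy)
print(f"Spearman rho = {rho:.3f}, p = {p_value:.4f}")
# A negative rho reproduces the pattern described in the abstract:
# more specific queries tend to receive lower-rated responses.
```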
Affiliation(s)
- Genis Cardona
- Department of Optics and Optometry, Universitat Politècnica de Catalunya, Terrassa, Spain
- Marc Argiles
- Department of Optics and Optometry, Universitat Politècnica de Catalunya, Terrassa, Spain
- Lluis Pérez-Mañá
- Department of Optics and Optometry, Universitat Politècnica de Catalunya, Terrassa, Spain
4
Hasan SS, Fury MS, Woo JJ, Kunze KN, Ramkumar PN. Ethical Application of Generative Artificial Intelligence in Medicine. Arthroscopy 2025; 41:874-885. [PMID: 39689842] [DOI: 10.1016/j.arthro.2024.12.011]
Abstract
Generative artificial intelligence (AI) may revolutionize health care, providing solutions that range from enhancing diagnostic accuracy to personalizing treatment plans. However, its rapid and largely unregulated integration into medicine raises ethical concerns related to data integrity, patient safety, and appropriate oversight. One of the primary ethical challenges lies in generative AI's potential to produce misleading or fabricated information, posing risks of misdiagnosis or inappropriate treatment recommendations, which underscore the necessity for robust physician oversight. Transparency also remains a critical concern, as the closed-source nature of many large-language models prevents both patients and health care providers from understanding the reasoning behind AI-generated outputs, potentially eroding trust. The lack of regulatory approval for AI as a medical device, combined with concerns around the security of patient-derived data and AI-generated synthetic data, further complicates its safe integration into clinical workflows. Furthermore, synthetic datasets generated by AI, although valuable for augmenting research in areas with scarce data, complicate questions of data ownership, patient consent, and scientific validity. In addition, generative AI's ability to streamline administrative tasks risks depersonalizing care, further distancing providers from patients. These challenges compound the deeper issues plaguing the health care system, including the emphasis of volume and speed over value and expertise. The use of generative AI in medicine brings about mass scaling of synthetic information, thereby necessitating careful adoption to protect patient care and medical advancement. Given these considerations, generative AI applications warrant regulatory and critical scrutiny. Key starting points include establishing strict standards for data security and transparency, implementing oversight akin to institutional review boards to govern data usage, and developing interdisciplinary guidelines that involve developers, clinicians, and ethicists. By addressing these concerns, we can better align generative AI adoption with the core foundations of humanistic health care, preserving patient safety, autonomy, and trust while harnessing AI's transformative potential. LEVEL OF EVIDENCE: Level V, expert opinion.
Affiliation(s)
- Matthew S Fury
- Baton Rouge Orthopaedic Clinic, Baton Rouge, Louisiana, U.S.A
- Joshua J Woo
- Brown University/The Warren Alpert School of Brown University, Providence, Rhode Island, U.S.A
- Kyle N Kunze
- Hospital for Special Surgery, New York, New York, U.S.A
5
Elek A, Yildiz HS, Akca B, Oren NC, Gundogdu B. Evaluating the Efficacy of Perplexity Scores in Distinguishing AI-Generated and Human-Written Abstracts. Acad Radiol 2025; 32:1785-1790. [PMID: 39915182] [DOI: 10.1016/j.acra.2025.01.017]
Abstract
RATIONALE AND OBJECTIVES We aimed to evaluate the efficacy of perplexity scores in distinguishing between human-written and AI-generated radiology abstracts and to assess the relative performance of available AI detection tools in detecting AI-generated content. METHODS Academic articles were curated from PubMed using the keywords "neuroimaging" and "angiography." Filters included English-language, open-access articles with abstracts without subheadings, published before 2021, and within chatbot processing word limits. The first 50 qualifying articles were selected, and their full texts were used to create AI-generated abstracts. Perplexity scores, which estimate sentence predictability, were calculated for both AI-generated and human-written abstracts. The performance of three AI tools in discriminating human-written from AI-generated abstracts was assessed. RESULTS The selected 50 articles consisted of 22 review articles (44%), 12 case or technical reports (24%), 15 research articles (30%), and one editorial (2%). The perplexity scores for human-written abstracts (median: 35.9; IQR: 25.11-51.8) were higher than those for AI-generated abstracts (median: 21.2; IQR: 16.87-28.38) (p=0.057), with an AUC of 0.7794. One AI tool performed worse than chance in distinguishing human-written from AI-generated abstracts, with an accuracy of 36% (p>0.05), while another tool yielded an accuracy of 95% with an AUC of 0.8688. CONCLUSION This study underscores the potential of perplexity scores in detecting AI-generated and potentially fraudulent abstracts. However, more research is needed to further explore these findings and their implications for the use of AI in academic writing. Future studies could also investigate other metrics or methods for distinguishing between human-written and AI-generated texts.
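Perplexity, as used in this study, is the exponential of the average negative log-likelihood a language model assigns to a text; lower values mean the text is more predictable to the model. The sketch below shows one common way to compute it with a small open model (GPT-2 via Hugging Face `transformers`); the choice of scoring model is an assumption for illustration, not necessarily the one used by the authors.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Perplexity = exp(mean cross-entropy of the model's next-token predictions)."""
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
    with torch.no_grad():
        out = model(enc.input_ids, labels=enc.input_ids)
    return torch.exp(out.loss).item()

# Hypothetical use: a lower score suggests more "predictable" (often AI-like) text.
print(perplexity("The patient presented with acute chest pain radiating to the left arm."))
```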
Affiliation(s)
- Alperen Elek
- Ege University Faculty of Medicine, Izmir, Turkey (A.E.).
- Benan Akca
- Marmara University, Electrical-Electronics Engineering Department, Istanbul, Turkey (B.A.)
- Nisa Cem Oren
- Advanced Midwest Radiology Illinois, Oak Brook, Illinois (N.C.O.)
- Batuhan Gundogdu
- University of Chicago, Department of Radiology, Chicago, Illinois (B.G.)
6
Shiraishi M, Sowa Y, Tomita K, Terao Y, Satake T, Muto M, Morita Y, Higai S, Toyohara Y, Kurokawa Y, Sunaga A, Okazaki M. Performance of Artificial Intelligence Chatbots in Answering Clinical Questions on Japanese Practical Guidelines for Implant-based Breast Reconstruction. Aesthetic Plast Surg 2025; 49:1947-1953. [PMID: 39592492] [DOI: 10.1007/s00266-024-04515-y]
Abstract
BACKGROUND Artificial intelligence (AI) chatbots, including ChatGPT-4 (GPT-4) and Grok-1 (Grok), have been shown to be potentially useful in several medical fields, but have not been examined in plastic and aesthetic surgery. The aim of this study is to evaluate the responses of these AI chatbots for clinical questions (CQs) related to the guidelines for implant-based breast reconstruction (IBBR) published by the Japan Society of Plastic and Reconstructive Surgery (JSPRS) in 2021. METHODS CQs in the JSPRS guidelines were used as question sources. Responses from two AI chatbots, GPT-4 and Grok, were evaluated for accuracy, informativeness, and readability by five Japanese Board-certified breast reconstruction specialists and five Japanese clinical fellows of plastic surgery. RESULTS GPT-4 outperformed Grok significantly in terms of accuracy (p < 0.001), informativeness (p < 0.001), and readability (p < 0.001) when evaluated by plastic surgery fellows. Compared to the original guidelines, Grok scored significantly lower in all three areas (all p < 0.001). The accuracy of GPT-4 was rated to be significantly higher based on scores given by plastic surgery fellows compared to those of breast reconstruction specialists (p = 0.012), whereas there was no significant difference between these scores for Grok. CONCLUSIONS The study suggests that GPT-4 has the potential to assist in interpreting and applying clinical guidelines for IBBR but importantly there is still a risk that AI chatbots can misinform. Further studies are needed to understand the broader role of current and future AI chatbots in breast reconstruction surgery. LEVEL OF EVIDENCE IV This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine Ratings, please refer to Table of Contents or online Instructions to Authors www.springer.com/00266 .
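The abstract does not state which statistical test produced the reported p-values, so the snippet below is only a plausible illustration: comparing two chatbots' Likert accuracy ratings with a nonparametric Mann-Whitney U test in SciPy, using invented ratings rather than the study's data.

```python
from scipy.stats import mannwhitneyu

# Hypothetical 5-point Likert accuracy ratings from ten evaluators per chatbot.
gpt4_ratings = [5, 4, 5, 4, 5, 4, 5, 5, 4, 5]
grok_ratings = [3, 2, 3, 4, 2, 3, 3, 2, 4, 3]

stat, p_value = mannwhitneyu(gpt4_ratings, grok_ratings, alternative="two-sided")
print(f"U = {stat}, p = {p_value:.4f}")
# A small p-value would mirror the abstract's finding that GPT-4's
# ratings were significantly higher than Grok's.
```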
Affiliation(s)
- Makoto Shiraishi
- Department of Plastic and Reconstructive Surgery, The University of Tokyo Hospital, Tokyo, Japan
- Yoshihiro Sowa
- Department of Plastic Surgery, Jichi Medical University, Yakushiji, Shimotsuke, Tochigi, Japan.
- Koichi Tomita
- Department of Plastic and Reconstructive Surgery, Kindai University, Osaka, Japan
- Yasunobu Terao
- Department of Plastic and Reconstructive Surgery, Tokyo Metropolitan Cancer and Infectious Diseases Center, Komagome Hospital, Tokyo, Japan
- Toshihiko Satake
- Department of Plastic, Reconstructive and Aesthetic Surgery, Toyama University Hospital, Toyama, Japan
- Mayu Muto
- Department of Plastic, Reconstructive and Aesthetic Surgery, Toyama University Hospital, Toyama, Japan
- Lala Breast Reconstruction Clinic Yokohama, Yokohama, Japan
- Department of Plastic Surgery, Yokohama City University Medical Center, Yokohama, Japan
- Yuhei Morita
- Department of Plastic Surgery, Jichi Medical University, Yakushiji, Shimotsuke, Tochigi, Japan
- Japanese Red Cross Koga Hospital, Koga, Japan
- Shino Higai
- Department of Plastic Surgery, Jichi Medical University, Yakushiji, Shimotsuke, Tochigi, Japan
- Yoshihiro Toyohara
- Department of Plastic Surgery, Jichi Medical University, Yakushiji, Shimotsuke, Tochigi, Japan
- Yasue Kurokawa
- Department of Plastic Surgery, Jichi Medical University, Yakushiji, Shimotsuke, Tochigi, Japan
- Ataru Sunaga
- Department of Plastic Surgery, Jichi Medical University, Yakushiji, Shimotsuke, Tochigi, Japan
- Mutsumi Okazaki
- Department of Plastic and Reconstructive Surgery, The University of Tokyo Hospital, Tokyo, Japan
7
Ding AW, Li S. Generative AI lacks the human creativity to achieve scientific discovery from scratch. Sci Rep 2025; 15:9587. [PMID: 40113940] [PMCID: PMC11926073] [DOI: 10.1038/s41598-025-93794-9]
Abstract
Scientists are interested in whether generative artificial intelligence (GenAI) can make scientific discoveries similar to those of humans. However, the results are mixed. Here, we examine whether, how and what scientific discovery GenAI can make in terms of the origin of hypotheses and experimental design through the interpretation of results. With the help of a computer-supported molecular genetic laboratory, GenAI assumes the role of a scientist tasked with investigating a Nobel-worthy scientific discovery in the molecular genetics field. We find that current GenAI can make only incremental discoveries but cannot achieve fundamental discoveries from scratch as humans can. Regarding the origin of the hypothesis, it is unable to generate truly original hypotheses and is incapable of having an epiphany to detect anomalies in experimental results. Therefore, current GenAI is good only at discovery tasks involving either a known representation of the domain knowledge or access to the human scientists' knowledge space. Furthermore, it has the illusion of making a completely successful discovery with overconfidence. We discuss approaches to address the limitations of current GenAI and its ethical concerns and biases in scientific discovery. This research provides insight into the role of GenAI in scientific discovery and general scientific innovation.
Affiliation(s)
- Shibo Li
- Kelley School of Business, Indiana University Bloomington, Bloomington, IN, USA.
8
Ozkara BB, Boutet A, Comstock BA, Van Goethem J, Huisman TAGM, Ross JS, Saba L, Shah LM, Wintermark M, Castillo M. Artificial Intelligence-Generated Editorials in Radiology: Can Expert Editors Detect Them? AJNR Am J Neuroradiol 2025; 46:559-566. [PMID: 39288967] [PMCID: PMC11979811] [DOI: 10.3174/ajnr.a8505]
Abstract
BACKGROUND AND PURPOSE Artificial intelligence is capable of generating complex texts that may be indistinguishable from those written by humans. We aimed to evaluate the ability of GPT-4 to write radiology editorials and to compare these with human-written counterparts, thereby determining their real-world applicability for scientific writing. MATERIALS AND METHODS Sixteen editorials from 8 journals were included. To generate the artificial intelligence (AI)-written editorials, the summary of each of the 16 human-written editorials was fed into GPT-4. Six experienced editors reviewed the articles. First, an unpaired approach was used: the raters were asked to evaluate the content of each article by using a 1-5 Likert scale across specified metrics and then to determine whether the editorials were written by humans or AI. The articles were then evaluated in pairs to determine which article was generated by AI and which should be published. Finally, the articles were analyzed with an AI detector and checked for plagiarism. RESULTS The human-written articles had a median AI probability score of 2.0%, whereas the AI-written articles had 58%. The median similarity score among AI-written articles was 3%. Fifty-eight percent of unpaired articles were correctly classified regarding authorship; rating accuracy increased to 70% in the paired setting. AI-written articles received slightly higher scores in most metrics. When stratified by perception, articles perceived as human-written were rated higher in most categories. In the paired setting, raters strongly preferred publishing the article they perceived as human-written (82%). CONCLUSIONS GPT-4 can write high-quality articles that iThenticate does not flag as plagiarized, that may go undetected by editors, and that AI detection tools identify only to a limited extent. Editors showed a positive bias toward human-written articles.
Affiliation(s)
- Burak Berksu Ozkara
- From the Department of Neuroradiology (B.B.O., M.W.), The University of Texas MD Anderson Center, Houston, Texas
- Alexandre Boutet
- Joint Department of Medical Imaging (A.B.), University of Toronto, Toronto, Ontario, Canada
- Bryan A Comstock
- Department of Biostatistics (B.A.C.), University of Washington, Seattle, Washington
- Johan Van Goethem
- Department of Radiology (J.V.G.), Antwerp University Hospital, Antwerp, Belgium
- Thierry A G M Huisman
- Department of Radiology (T.A.G.M.H.), Texas Children's Hospital and Baylor College of Medicine, Houston, Texas
- Jeffrey S Ross
- Department of Radiology (J.S.R.), Mayo Clinic Arizona, Phoenix, Arizona
- Luca Saba
- Department of Radiology (L.S.), University of Cagliari, Cagliari, Italy
- Lubdha M Shah
- Department of Radiology (L.M.S.), University of Utah, Salt Lake City, Utah
- Max Wintermark
- From the Department of Neuroradiology (B.B.O., M.W.), The University of Texas MD Anderson Center, Houston, Texas
- Mauricio Castillo
- Department of Radiology (M.C.), University of North Carolina School of Medicine, Chapel Hill, North Carolina
9
Wang Q. Beyond Pandora's box: vast potential with significant challenges. Forensic Sci Res 2025; 10:owae017. [PMID: 39990699] [PMCID: PMC11839503] [DOI: 10.1093/fsr/owae017]
Affiliation(s)
- Qi Wang
- Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification, School of Forensic Medicine, Southern Medical University, Guangzhou, China
10
Ibrahim H, Liu F, Zaki Y, Rahwan T. Citation manipulation through citation mills and pre-print servers. Sci Rep 2025; 15:5480. [PMID: 39953094] [PMCID: PMC11828878] [DOI: 10.1038/s41598-025-88709-7]
Abstract
Citations are widely considered in scientists' evaluation. As such, scientists may be incentivized to inflate their citation counts. While previous literature has examined self-citations and citation cartels, it remains unclear whether scientists can purchase citations. Here, we compile a dataset of ~1.6 million profiles on Google Scholar to examine instances of citation fraud on the platform. We survey faculty at highly-ranked universities, and confirm that Google Scholar is widely used when evaluating scientists. We then engage with a citation-boosting service, and manage to purchase 50 citations while assuming the identity of a fictional author. Taken as a whole, our findings bring to light new forms of citation manipulation, and emphasize the need to look beyond citation counts.
Affiliation(s)
- Hazem Ibrahim
- Department of Computer Science, New York University, Abu Dhabi, UAE
- Tandon School of Engineering, New York University, New York, USA
- Fengyuan Liu
- Department of Computer Science, New York University, Abu Dhabi, UAE
- Courant Institute of Mathematical Sciences, New York University, New York, NY, 10012, USA
- Yasir Zaki
- Department of Computer Science, New York University, Abu Dhabi, UAE.
- Talal Rahwan
- Department of Computer Science, New York University, Abu Dhabi, UAE.
11
Ravšelj D, Keržič D, Tomaževič N, Umek L, Brezovar N, A. Iahad N, Abdulla AA, Akopyan A, Aldana Segura MW, AlHumaid J, Allam MF, Alló M, Andoh RPK, Andronic O, Arthur YD, Aydın F, Badran A, Balbontín-Alvarado R, Ben Saad H, Bencsik A, Benning I, Besimi A, Bezerra DDS, Buizza C, Burro R, Bwalya A, Cachero C, Castillo-Briceno P, Castro H, Chai CS, Charalambous C, Chiu TKF, Clipa O, Colombari R, Corral Escobedo LJH, Costa E, Crețulescu RG, Crispino M, Cucari N, Dalton F, Demir Kaya M, Dumić-Čule I, Dwidienawati D, Ebardo R, Egbenya DL, Faris ME, Fečko M, Ferrinho P, Florea A, Fong CY, Francis Z, Ghilardi A, González-Fernández B, Hau D, Hossain MS, Hug T, Inasius F, Ismail MJ, Jahić H, Jessa MO, Kapanadze M, Kar SK, Kateeb ET, Kaya F, Khadri HO, Kikuchi M, Kobets VM, Kostova KM, Krasmane E, Lau J, Law WHC, Lazăr F, Lazović-Pita L, Lee VWY, Li J, López-Aguilar DV, Luca A, Luciano RG, Machin-Mastromatteo JD, Madi M, Manguele AL, Manrique RF, Mapulanga T, Marimon F, Marinova GI, Mas-Machuca M, Mejía-Rodríguez O, Meletiou-Mavrotheris M, Méndez-Prado SM, Meza-Cano JM, Mirķe E, Mishra A, Mital O, Mollica C, Morariu DI, Mospan N, Mukuka A, Navarro Jiménez SG, Nikaj I, Nisheva MM, et al. Higher education students' perceptions of ChatGPT: A global study of early reactions. PLoS One 2025; 20:e0315011. [PMID: 39908277] [PMCID: PMC11798494] [DOI: 10.1371/journal.pone.0315011]
Abstract
The paper presents the most comprehensive and large-scale global study to date on how higher education students perceived the use of ChatGPT in early 2024. With a sample of 23,218 students from 109 countries and territories, the study reveals that students primarily used ChatGPT for brainstorming, summarizing texts, and finding research articles, with a few using it for professional and creative writing. They found it useful for simplifying complex information and summarizing content, but less reliable for providing information and supporting classroom learning, though some considered its information clearer than that from peers and teachers. Moreover, students agreed on the need for AI regulations at all levels due to concerns about ChatGPT promoting cheating, plagiarism, and social isolation. However, they believed ChatGPT could potentially enhance their access to knowledge and improve their learning experience, study efficiency, and chances of achieving good grades. While ChatGPT was perceived as effective in potentially improving AI literacy, digital communication, and content creation skills, it was less useful for interpersonal communication, decision-making, numeracy, native language proficiency, and the development of critical thinking skills. Students also felt that ChatGPT would boost demand for AI-related skills and facilitate remote work without significantly impacting unemployment. Emotionally, students mostly felt positive using ChatGPT, with curiosity and calmness being the most common emotions. Further examinations reveal variations in students' perceptions across different socio-demographic and geographic factors, with key factors influencing students' use of ChatGPT also being identified. Higher education institutions' managers and teachers may benefit from these findings while formulating the curricula and instructions/regulations for ChatGPT use, as well as when designing the teaching methods and assessment tools. Moreover, policymakers may also consider the findings when formulating strategies for secondary and higher education system development, especially in light of changing labor market needs and related digital skills development.
Affiliation(s)
- Dejan Ravšelj
- Faculty of Public Administration, University of Ljubljana, Ljubljana, Slovenia
| | - Damijana Keržič
- Faculty of Public Administration, University of Ljubljana, Ljubljana, Slovenia
| | - Nina Tomaževič
- Faculty of Public Administration, University of Ljubljana, Ljubljana, Slovenia
| | - Lan Umek
- Faculty of Public Administration, University of Ljubljana, Ljubljana, Slovenia
| | - Nejc Brezovar
- Faculty of Public Administration, University of Ljubljana, Ljubljana, Slovenia
| | - Noorminshah A. Iahad
- Department of Information Systems, Faculty of Management, Universiti Teknologi Malaysia, Skudai, Johor Bahru, Malaysia
| | - Ali Abdulla Abdulla
- Department of Computer Science and IT, State University of Zanzibar (SUZA), Zanzibar, Tanzania
| | - Anait Akopyan
- Department of English for the Humanities, Southern Federal University, Rostov-on-Don, Russia
| | - Magdalena Waleska Aldana Segura
- Education Department, Galileo University, Guatemala, Guatemala
- Physics Department, San Carlos de Guatemala University, Guatemala, Guatemala
| | - Jehan AlHumaid
- Department of Preventive Dental Sciences, College of Dentistry, Imam Abdulrahman Bin Faisal University, Dammam, Saudi Arabia
| | | | - Maria Alló
- Department of Economics, Faculty of Economics and Business, University of A Coruna, A Coruna, Spain
| | - Raphael Papa Kweku Andoh
- Directorate of Research, Innovation and Consultancy, University of Cape Coast, Cape Coast, Ghana
| | - Octavian Andronic
- Innovation and eHealth Center, Carol Davila University of Medicine and Pharmacy, Bucharest, Romania
| | - Yarhands Dissou Arthur
- Department of Mathematics Education, Faculty of Applied Sciences and Mathematics Education, Akenten Appiah Menka University of Skills Training and Entrepreneurial Development (AAMUSTED), Kumasi, Ghana
| | - Fatih Aydın
- Faculty of Education, Sivas Cumhuriyet University, Sivas, Türkiye
| | - Amira Badran
- Faculty of Dentistry, Ain Shams University, Cairo, Egypt
| | | | - Helmi Ben Saad
- Research Laboratory LR12SP09 “Heart Failure”, Faculty of Medicine of Sousse, University of Sousse, Sousse, Tunisia
| | - Andrea Bencsik
- Department of Management, University of Pannonia, Veszprem, Hungary
- Department of Management, J. Selye University, Komarno, Slovakia
| | - Isaac Benning
- Department of Mathematics & ICT Education, University of Cape Coast, Cape Coast, Ghana
| | - Adrian Besimi
- Faculty of Contemporary Sciences and Technologies, South East European University, Tetovo, Republic of North Macedonia
| | | | - Chiara Buizza
- Department of Clinical and Experimental Sciences, University of Brescia, Brescia, Italy
| | - Roberto Burro
- Department of Human Sciences, University of Verona, Verona, Italy
| | - Anthony Bwalya
- Department of Biological Sciences, Kwame Nkrumah University, Kitwe, Zambia
| | - Cristina Cachero
- Languages and Computer Systems, University of Alicante, Alicante, Spain
| | - Patricia Castillo-Briceno
- EBIOAC Lab, Faculty of Life Sciences and Technologies, Universidad Laica Eloy Alfaro de Manabi ULEAM, Manta, Ecuador
| | - Harold Castro
- Department of Systems and Computing Engineering, Universidad de los Andes, Bogota, Colombia
| | - Ching Sing Chai
- Centre for Learning Sciences and Technologies, Chinese University of Hong Kong, Hong Kong, Hong Kong SAR, China
| | | | - Thomas K. F. Chiu
- Department of Curriculum and Instruction, Chinese University of Hong Kong, Hong Kong, Hong Kong SAR, China
| | - Otilia Clipa
- Science of Education, Stefan cel Mare University of Suceava, Suceava, Romania
| | - Ruggero Colombari
- Department of Economics and Social Sciences, International University of Catalonia, Barcelona, Spain
| | | | - Elísio Costa
- Competence Center on Active and Healthy Ageing and CINTESIS@Rise, Faculty of Pharmacy, University of Porto, Porto, Portugal
| | - Radu George Crețulescu
- Computer Science and Electrical Engineering Department, Faculty of Engineering, Lucian Blaga University of Sibiu, Sibiu, Romania
| | | | - Nicola Cucari
- Department of Management, Faculty of Economics, Sapienza University of Rome, Rome, Italy
| | - Fergus Dalton
- Department of Psychology, University of the Fraser Valley, Abbotsford, Canada
| | | | - Ivo Dumić-Čule
- Department of Nursing, University North, Varaždin, Croatia
| | | | - Ryan Ebardo
- Department of Information Technology, De La Salle University, Manila, Philippines
| | - Daniel Lawer Egbenya
- College of Health and Allied Sciences, University of Cape Coast, Cape Coast, Ghana
| | | | - Miroslav Fečko
- Faculty of Public Administration, Pavol Jozef Šafárik University in Košice, Košice, Slovakia
| | - Paulo Ferrinho
- Global Health and Tropical Medicine (GHTM), Associate Laboratory in Translation and Innovation Towards Global Health (LA-REAL), Instituto de Higiene e Medicina Tropical (IHMT), Nova University of Lisbon, Lisbon, Portugal
| | - Adrian Florea
- Computer Science and Electrical Engineering Department, Faculty of Engineering, Lucian Blaga University of Sibiu, Sibiu, Romania
| | - Chun Yuen Fong
- International College of Liberal Arts, Yamanashi Gakuin University, Kofu, Japan
| | - Zoë Francis
- Department of Psychology, University of the Fraser Valley, Abbotsford, Canada
| | - Alberto Ghilardi
- Department of Clinical and Experimental Sciences, University of Brescia, Brescia, Italy
| | | | - Daniela Hau
- Department of Education and Social Work, University of Luxembourg, Belval Esch-sur-Alzette, Luxembourg
| | - Md. Shamim Hossain
- Department of Marketing, Hajee Mohammad Danesh Science and Technology University, Dinajpur, Bangladesh
| | - Theo Hug
- Department of Media, Society and Communication, University of Innsbruck, Innsbruck, Austria
| | - Fany Inasius
- School of Accounting, Bina Nusantara University, Jakarta, Indonesia
| | | | - Hatidža Jahić
- School of Economics and Business, University of Sarajevo, Sarajevo, Bosnia and Herzegovina
| | | | - Marika Kapanadze
- School of Business Technology and Education, Ilia State University, Tbilisi, Georgia
| | - Sujita Kumar Kar
- Department of Psychiatry, King George’s Medical University, Lucknow, India
| | - Elham Talib Kateeb
- Oral Health Research and Promotion Unit, Al-Quds University. Jerusalem, Palestine
| | - Feridun Kaya
- Faculty of Letters, University of Ataturk, Erzurum, Turkiye
| | | | - Masao Kikuchi
- Department of Public Management. Meiji University, Tokyo, Japan
| | | | - Katerina Metodieva Kostova
- Department of Technologies and Management of Communication Systems, Faculty of Telecommunications, Technical University of Sofia, Sofia, Bulgaria
| | | | - Jesus Lau
- Faculty of Pedagogy, Universidad Veracruzana, Veracruz, Mexico
| | - Wai Him Crystal Law
- International College of Liberal Arts, Yamanashi Gakuin University, Kofu, Japan
| | - Florin Lazăr
- Sociology and Social Work, University of Bucharest, Bucharest, Romania
| | - Lejla Lazović-Pita
- School of Economics and Business, University of Sarajevo, Sarajevo, Bosnia and Herzegovina
| | - Vivian Wing Yan Lee
- Centre for Learning Enhancement and Research (CLEAR), Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
| | - Jingtai Li
- School of Foreign Languages, Jiaying University, Meizhou, China
| | | | - Adrian Luca
- Department of Applied Psychology and Psychotherapy, Faculty of Psychology and Educational Sciences, University of Bucharest, Bucharest, Romania
| | - Ruth Garcia Luciano
- College of Information and Communications Technology, Nueva Ecija University of Science and Technology, Cabanatuan City, Nueva Ecija, Philippines
| | | | - Marwa Madi
- Department of Preventive Dental Sciences, College of Dentistry, Imam Abdulrahman Bin Faisal University, Dammam, Saudi Arabia
| | | | | | - Thumah Mapulanga
- African Centre of Excellence for Innovative Teaching and Learning Mathematics and Science, University of Rwanda, Kayonza, Rwanda
| | - Frederic Marimon
- Department of Economics and Social Sciences, International University of Catalonia, Barcelona, Spain
| | - Galia Ilieva Marinova
- Department of Technologies and Management of Communication Systems, Faculty of Telecommunications, Technical University of Sofia, Sofia, Bulgaria
| | - Marta Mas-Machuca
- Department of Economics and Social Sciences, International University of Catalonia, Barcelona, Spain
| | | | | | | | - José Manuel Meza-Cano
- Faculty of Higher Education Iztacala, National Autonomous University of Mexico, State of Mexico, Mexico
| | - Evija Mirķe
- Institute of Digital Humanities, Faculty of Computer Science, Information Technology and Energy, Riga Technical University, Riga. Latvia
| | - Alpana Mishra
- Department of Community Medicine, Kalinga Institute of Medical Science. Bhubaneswar, India
| | - Ondrej Mital
- Faculty of Public Administration, Pavol Jozef Šafárik University in Košice, Košice, Slovakia
| | - Cristina Mollica
- Department of Statistical Sciences, Sapienza University of Rome, Rome, Italy
| | - Daniel Ionel Morariu
- Computer Science and Electrical Engineering Department, Faculty of Engineering, Lucian Blaga University of Sibiu, Sibiu, Romania
| | - Natalia Mospan
- Department of Linguistics and Translation, Borys Grinchenko Kyiv University, Kyiv, Ukraine
| | - Angel Mukuka
- Department of Mathematics, Science and Technology Education, Mukuba University, Kitwe, Zambia
| | | | - Irena Nikaj
- Faculty of Education & Philology, University Fan S. Noli Korça, Korça, Albania
| | - Maria Mihaylova Nisheva
- Faculty of Mathematics and Informatics, Sofia University St. Kliment Ohridski, Sofia, Bulgaria
| | - Efi Nisiforou
- Department of Education, University of Nicosia, Nicosia, Cyprus
| | - Joseph Njiku
- Educational Psychology and Curriculum Studies, Dar es Salaam University College of Education, University of Dar es Salaam, Dar es Salaam, Tanzania
| | - Singhanat Nomnian
- Research Institute for Languages and Cultures of Asia, Mahidol University, Salaya, Thailand
| | | | - Ernest Nyamekye
- Department of Arts Education, University of Cape Coast, Cape Coast, Ghana
| | - Alka Obadić
- Faculty of Economics and Business, University of Zagreb, Zagreb, Croatia
| | | | - Dorit Olenik-Shemesh
- Education & Psychology, The Research Center for Innovation in Learning Technologies, Open University of Israel, Raanana, Israel
| | - Izabela Ostoj
- Department of Economics, Faculty of Economics, University of Economics in Katowice, Katowice, Poland
| | | | - Almir Peštek
- School of Economics and Business, University of Sarajevo, Sarajevo, Bosnia and Herzegovina
| | - Amila Pilav-Velić
- School of Economics and Business, University of Sarajevo, Sarajevo, Bosnia and Herzegovina
| | | | - Eyal Rabin
- Education & Psychology, The Research Center for Innovation in Learning Technologies, Open University of Israel, Raanana, Israel
| | | | - Agustine Ramie
- Nursing Department, Health Polytechnic of Banjarmasin, Banjarbaru, Indonesia
| | - Md. Mamun ur Rashid
- Department of Agricultural Extension and Rural Development, Patuakhali Science and Technology University, Patuakhali, Bangladesh
| | - Robert A. P. Reuter
- Department of Education and Social Work, University of Luxembourg, Belval Esch-sur-Alzette, Luxembourg
| | - Valentina Reyes
- Facultad de Economía y Negocios, Universidad de Chile, Santiago, Chile
| | | | - Paul Rodway
- Division of Psychology, Faculty of Health, Medicine and Society, University of Chester, Chester, United Kingdom
| | - Silvia Ručinská
- Faculty of Public Administration, Pavol Jozef Šafárik University in Košice, Košice, Slovakia
| | - Shorena Sadzaglishvili
- Research Center for Advancing Science in the Social Services and Interventions, Social Work Program, Faculty of Arts and Science, Ilia State University, Tbilisi, Georgia
| | - Ashraf Atta M. S. Salem
- College of Languages & Translation, Sadat Academy for Management Sciences, Alexandria, Egypt
| | - Gordana Savić
- Faculty of Organizational Sciences, University of Belgrade, Belgrade, Serbia
| | - Astrid Schepman
- Division of Psychology, Faculty of Health, Medicine and Society, University of Chester, Chester, United Kingdom
| | - Samia Mokhtar Shahpo
- Department of Early Childhood, College of Sciences and Humanities Studies, Imam Abdulrahman Bin Faisal University, Dammam, Saudi Arabia
| | | | - Emma Soler
- Department of Economics and Social Sciences, International University of Catalonia, Barcelona, Spain
| | - Bengi Sonyel
- Department of Educational Sciences, Eastern Mediterranean University, Famagusta, Cyprus
| | - Eliza Stefanova
- Faculty of Mathematics and Informatics, Sofia University St. Kliment Ohridski, Sofia, Bulgaria
| | - Anna Stone
- School of Psychology, University of East London, London, United Kingdom
| | - Artur Strzelecki
- Department of Informatics, University of Economics in Katowice, Katowice, Poland
| | - Tetsuji Tanaka
- Department of Economics, Meiji Gakuin University, Tokyo, Japan
| | | | - Andrea Teira-Fachado
- Department of Economics, Faculty of Economics and Business, University of A Coruna, A Coruna, Spain
- Public Law, Faculty of Law, University of A Coruna, A Coruna, Spain
| | - Henri Tilga
- Institute of Sport Sciences and Physiotherapy, University of Tartu, Tartu. Estonia
| | - Jelena Titko
- EKA University of Applied Sciences, Riga, Latvia
| | - Maryna Tolmach
- Faculty of Distance Learning, Kyiv National University of Culture and Arts, Kyiv, Ukraine
| | - Dedi Turmudi
- English Education Study Program, Faculty of Teachers’ Training and Education, Muhammadiyah University of Metro, Metro, Indonesia
| | - Laura Varela-Candamio
- Department of Economics, Faculty of Economics and Business, University of A Coruna, A Coruna, Spain
| | - Ioanna Vekiri
- Department of Education, European University Cyprus, Nicosia, Cyprus
| | - Giada Vicentini
- Department of Human Sciences, University of Verona, Verona, Italy
| | - Erisher Woyo
- Faculty of Business & Law, Manchester Metropolitan University, Manchester, United Kingdom
| | - Özlem Yorulmaz
- Department of Econometrics & Statistics, Faculty of Economics, Istanbul University, Istanbul, Türkiye
| | - Said A. S. Yunus
- School of Education, State University of Zanzibar (SUZA), Zanzibar, Tanzania
| | - Ana-Maria Zamfir
- Faculty of Business and Administration, University of Bucharest, Bucharest, Romania
- National Scientific Research Institute for Labour and Social Protection, Bucharest, Romania
| | - Munyaradzi Zhou
- Information and Marketing Sciences, Midlands State University, Gweru, Zimbabwe
12
Al‐Qudimat AR, Fares ZE, Elaarag M, Osman M, Al‐Zoubi RM, Aboumarzouk OM. Advancing Medical Research Through Artificial Intelligence: Progressive and Transformative Strategies: A Literature Review. Health Sci Rep 2025; 8:e70200. [PMID: 39980823] [PMCID: PMC11839394] [DOI: 10.1002/hsr2.70200]
Abstract
Background and Aims Artificial intelligence (AI) has become integral to medical research, impacting various aspects such as data analysis, writing assistance, and publishing. This paper explores the multifaceted influence of AI on the process of writing medical research papers, encompassing data analysis, ethical considerations, writing assistance, and publishing efficiency. Methods The review was conducted following the PRISMA guidelines; a comprehensive search was performed in Scopus, PubMed, EMBASE, and MEDLINE databases for research publications on artificial intelligence in medical research published up to October 2023. Results AI facilitates the writing process by generating drafts, offering grammar and style suggestions, and enhancing manuscript quality through advanced models like ChatGPT. Ethical concerns regarding content ownership and potential biases in AI-generated content underscore the need for collaborative efforts among researchers, publishers, and AI creators to establish ethical standards. Moreover, AI significantly influences data analysis in healthcare, optimizing outcomes and patient care, particularly in fields such as obstetrics and gynecology and pharmaceutical research. The application of AI in publishing, ranging from peer review to manuscript quality control and journal matching, underscores its potential to streamline and enhance the entire research and publication process. Overall, while AI presents substantial benefits, ongoing research, and ethical guidelines are essential for its responsible integration into the evolving landscape of medical research and publishing. Conclusion The integration of AI in medical research has revolutionized efficiency and innovation, impacting data analysis, writing assistance, publishing, and others. While AI tools offer significant benefits, ethical considerations such as biases and content ownership must be addressed. Ongoing research and collaborative efforts are crucial to ensure responsible and transparent AI implementation in the dynamic landscape of medical research and publishing.
Affiliation(s)
- Ahmad R. Al‐Qudimat
- Department of Surgery, Hamad Medical Corporation, Surgical Research Section, Doha, Qatar
- Department of Public Health, College of Health Sciences, QU‐Health, Qatar University, Doha, Qatar
- Zainab E. Fares
- Department of Surgery, Hamad Medical Corporation, Surgical Research Section, Doha, Qatar
- Mai Elaarag
- Department of Surgery, Hamad Medical Corporation, Surgical Research Section, Doha, Qatar
- Maha Osman
- Department of Public Health, College of Health Sciences, QU‐Health, Qatar University, Doha, Qatar
- Raed M. Al‐Zoubi
- Department of Surgery, Hamad Medical Corporation, Surgical Research Section, Doha, Qatar
- Department of Biomedical Sciences, College of Health Sciences, QU‐Health, Qatar University, Doha, Qatar
- Department of Chemistry, College of Science, Jordan University of Science and Technology, Irbid, Jordan
- Omar M. Aboumarzouk
- Department of Surgery, Hamad Medical Corporation, Surgical Research Section, Doha, Qatar
- School of Medicine, Dentistry and Nursing, The University of Glasgow, Glasgow, UK
13
Alencar-Palha C, Ocampo T, Silva TP, Neves FS, Oliveira ML. Performance of a Generative Pre-Trained Transformer in Generating Scientific Abstracts in Dentistry: A Comparative Observational Study. Eur J Dent Educ 2025; 29:149-154. [PMID: 39562504] [DOI: 10.1111/eje.13057]
Abstract
OBJECTIVES To evaluate the performance of a Generative Pre-trained Transformer (GPT) in generating scientific abstracts in dentistry. METHODS Ten scientific articles in dental radiology had their original abstracts collected, while another 10 articles had their methodology and results added to a ChatGPT prompt to generate an abstract. All abstracts were randomised and compiled into a single file for subsequent assessment. Five evaluators classified whether the abstract was generated by a human using a 5-point scale and provided justifications within seven aspects: formatting, information accuracy, orthography, punctuation, terminology, text fluency, and writing style. Furthermore, an online GPT detector provided "Human Score" values, and a plagiarism detector assessed similarity with existing literature. RESULTS Sensitivity values for detecting human writing ranged from 0.20 to 0.70, with a mean of 0.58; specificity values ranged from 0.40 to 0.90, with a mean of 0.62; and accuracy values ranged from 0.50 to 0.80, with a mean of 0.60. Orthography and Punctuation were the most indicated aspects for the abstract generated by ChatGPT. The GPT detector revealed confidence levels for a "Human Score" of 16.9% for the AI-generated texts and plagiarism levels averaging 35%. CONCLUSION The GPT exhibited commendable performance in generating scientific abstracts when evaluated by humans, as the generated abstracts were indistinguishable from those generated by humans. When evaluated by an online GPT detector, the use of GPT became apparent.
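The sensitivity, specificity, and accuracy values reported above follow directly from each evaluator's confusion matrix, treating "human-written" as the positive class. A small worked sketch with invented counts (not the study's data):

```python
def detection_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity, specificity, and accuracy for a human-vs-AI classification task.

    tp: human-written abstracts correctly labelled human
    fn: human-written abstracts labelled AI
    tn: AI-generated abstracts correctly labelled AI
    fp: AI-generated abstracts labelled human
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }

# Hypothetical evaluator who assessed 10 human-written and 10 AI-generated abstracts.
print(detection_metrics(tp=6, fn=4, tn=6, fp=4))
# -> {'sensitivity': 0.6, 'specificity': 0.6, 'accuracy': 0.6}
```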
Affiliation(s)
- Caio Alencar-Palha
- Division of Oral Radiology, Department of Oral Diagnosis, Piracicaba Dental School, University of Campinas, Piracicaba, São Paulo, Brazil
- Thais Ocampo
- Division of Oral Radiology, Department of Oral Diagnosis, Piracicaba Dental School, University of Campinas, Piracicaba, São Paulo, Brazil
- Thaisa Pinheiro Silva
- Division of Oral Radiology, Department of Oral Diagnosis, Piracicaba Dental School, University of Campinas, Piracicaba, São Paulo, Brazil
- Frederico Sampaio Neves
- Division of Oral Radiology, Department of Propedeutics and Integrated Clinic, School of Dentistry, Federal University of Bahia, Salvador, Bahia, Brazil
- Matheus L Oliveira
- Division of Oral Radiology, Department of Oral Diagnosis, Piracicaba Dental School, University of Campinas, Piracicaba, São Paulo, Brazil
14
Nabata KJ, AlShehri Y, Mashat A, Wiseman SM. Evaluating human ability to distinguish between ChatGPT-generated and original scientific abstracts. Updates Surg 2025. [PMID: 39853655] [DOI: 10.1007/s13304-025-02106-3]
Abstract
This study aims to analyze the accuracy of human reviewers in identifying scientific abstracts generated by ChatGPT compared with the original abstracts. Participants completed an online survey presenting two research abstracts: one generated by ChatGPT and one original abstract. They had to identify which abstract was generated by AI and provide feedback on their preference and perceptions of AI technology in academic writing. This observational cross-sectional study involved surgical trainees and faculty at the University of British Columbia. The survey was distributed to all surgeons and trainees affiliated with the University of British Columbia, which includes general surgery, orthopedic surgery, thoracic surgery, plastic surgery, cardiovascular surgery, vascular surgery, neurosurgery, urology, otolaryngology, pediatric surgery, and obstetrics and gynecology. A total of 41 participants completed the survey, comprising 10 (23.3%) surgeons. Eighteen (40.0%) participants correctly identified the original abstract. Twenty-six (63.4%) participants preferred the ChatGPT abstract (p = 0.0001). On multivariate analysis, preferring the original abstract was associated with correct identification of the original abstract [OR 7.46, 95% CI (1.78, 31.4), p = 0.006]. The results suggest that human reviewers cannot accurately distinguish between human- and AI-generated abstracts, and overall, there was a trend toward a preference for AI-generated abstracts. The findings contribute to understanding the implications of AI in manuscript production, including its benefits and ethical considerations.
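The multivariate result above (an odds ratio of 7.46 for preferring the original abstract as a predictor of correct identification) is the kind of output a logistic regression produces. The sketch below, using simulated data and the `statsmodels` library, shows how such an odds ratio and confidence interval could be obtained; it is an illustration under those assumptions, not the authors' analysis code.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Simulated survey: did the respondent prefer the original abstract (1/0),
# and did they correctly identify it (1/0)?
preferred_original = rng.integers(0, 2, size=41)
p_correct = np.where(preferred_original == 1, 0.8, 0.3)
correct_id = rng.binomial(1, p_correct)

X = sm.add_constant(preferred_original.astype(float))
fit = sm.Logit(correct_id, X).fit(disp=False)

odds_ratios = np.exp(fit.params)    # exponentiated coefficients
ci = np.exp(fit.conf_int())         # 95% CI on the odds-ratio scale
print("OR (preference):", round(odds_ratios[1], 2), "95% CI:", np.round(ci[1], 2))
```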
Affiliation(s)
- Kylie J Nabata: Department of Surgery, St. Paul's Hospital, 1081 Burrard St., Vancouver, BC, V6Z 1Y6, Canada; University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Yasir AlShehri: Department of Orthopaedic Surgery, Faculty of Medicine, The University of British Columbia, 2775 Laurel St., Vancouver, BC, V5Z 1M9, Canada
- Abdullah Mashat: Department of Surgery, St. Paul's Hospital, 1081 Burrard St., Vancouver, BC, V6Z 1Y6, Canada; University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Sam M Wiseman: Department of Surgery, St. Paul's Hospital, 1081 Burrard St., Vancouver, BC, V6Z 1Y6, Canada; University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
|
15
|
Moskovich L, Rozani V. Health profession students' perceptions of ChatGPT in healthcare and education: insights from a mixed-methods study. BMC MEDICAL EDUCATION 2025; 25:98. [PMID: 39833868 PMCID: PMC11748239 DOI: 10.1186/s12909-025-06702-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/14/2024] [Accepted: 01/13/2025] [Indexed: 01/22/2025]
Abstract
OBJECTIVE The aim of this study was to investigate the perceptions of health profession students regarding ChatGPT use and the potential impact of integrating ChatGPT in healthcare and education. BACKGROUND Artificial Intelligence is increasingly utilized in medical education and clinical profession training. However, since its introduction, ChatGPT remains relatively unexplored in terms of health profession students' acceptance of its use in education and practice. DESIGN This study employed a mixed-methods approach, using a web-based survey. METHODS The study involved a convenience sample recruited through various methods, including Faculty of Medicine announcements, social media, and snowball sampling, during the second semester (March to June 2023). Data were collected using a structured questionnaire with closed-ended questions and three open-ended questions. The final sample comprised 217 undergraduate health profession students, including 73 (33.6%) nursing students, 65 (30.0%) medical students, and 79 (36.4%) occupational therapy, physiotherapy, and speech therapy students. RESULTS Among the surveyed students, 86.2% were familiar with ChatGPT, with generally positive perceptions as reflected by a mean score of 4.04 (SD = 0.62) on a scale of 1 to 5. Positive feedback was particularly noted with respect to ChatGPT's role in information retrieval and summarization. The qualitative data revealed three main themes: experiences with ChatGPT, its impact on the quality of healthcare, and its integration into the curriculum. The findings highlight benefits such as serving as a convenient tool for accessing information, reducing human errors, and fostering innovative learning approaches. However, they also underscore areas of concern, including ethical considerations, challenges in fostering critical thinking, and issues related to verification. The absence of significant differences between the different fields of study indicates consistent perceptions across nursing, medicine, and other health profession students. CONCLUSIONS Our findings underscore the necessity for continuous refinement to enhance ChatGPT's accuracy, reliability, and alignment with the diverse educational needs of health professions. These insights not only deepen our understanding of student perceptions of ChatGPT in healthcare education but also have significant implications for the future integration of AI in health profession practice. The study emphasizes the importance of a careful balance between leveraging the benefits of AI tools and addressing ethical and pedagogical concerns.
Affiliation(s)
- Lior Moskovich: Faculty of Medical and Health Sciences, Tel Aviv University, Tel Aviv, Israel
- Violetta Rozani: Department of Nursing Sciences, Faculty of Medical and Health Sciences, The Stanley Steyer School of Health Professions, Tel Aviv University, Tel Aviv, Israel
|
16
|
Kim J, Vajravelu BN. Assessing the Current Limitations of Large Language Models in Advancing Health Care Education. JMIR Form Res 2025; 9:e51319. [PMID: 39819585 PMCID: PMC11756841 DOI: 10.2196/51319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2023] [Revised: 08/31/2024] [Accepted: 09/03/2024] [Indexed: 01/19/2025] Open
Abstract
The integration of large language models (LLMs), as seen with the generative pretrained transformer series, into health care education and clinical management represents a transformative potential. The practical use of current LLMs in health care sparks great anticipation for new avenues, yet their adoption also elicits considerable concerns that necessitate careful deliberation. This study aims to evaluate the application of state-of-the-art LLMs in health care education, highlighting the following shortcomings as areas requiring significant and urgent improvements: (1) threats to academic integrity, (2) dissemination of misinformation and risks of automation bias, (3) challenges with information completeness and consistency, (4) inequity of access, (5) risks of algorithmic bias, (6) exhibition of moral instability, (7) technological limitations in plugin tools, and (8) lack of regulatory oversight in addressing legal and ethical challenges. Future research should focus on strategically addressing the persistent challenges of LLMs highlighted in this paper, opening the door for effective measures that can improve their application in health care education.
Affiliation(s)
- JaeYong Kim: School of Pharmacy, Massachusetts College of Pharmacy and Health Sciences, Boston, MA, United States
- Bathri Narayan Vajravelu: Department of Physician Assistant Studies, Massachusetts College of Pharmacy and Health Sciences, 179 Longwood Avenue, Boston, MA, 02115, United States
|
17
|
Schwartzman JD, Shaath MK, Kerr MS, Green CC, Haidukewych GJ. ChatGPT is an Unreliable Source of Peer-Reviewed Information for Common Total Knee and Hip Arthroplasty Patient Questions. Adv Orthop 2025; 2025:5534704. [PMID: 39817149 PMCID: PMC11729512 DOI: 10.1155/aort/5534704] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 04/28/2024] [Accepted: 11/27/2024] [Indexed: 01/18/2025] Open
Abstract
Background: Advances in artificial intelligence (AI), machine learning, and publicly accessible language model tools such as ChatGPT-3.5 continue to shape the landscape of modern medicine and patient education. ChatGPT's open access (OA), instant, human-sounding interface capable of carrying discussion on myriad topics makes it a potentially useful resource for patients seeking medical advice. As it pertains to orthopedic surgery, ChatGPT may become a source to answer common preoperative questions regarding total knee arthroplasty (TKA) and total hip arthroplasty (THA). Since ChatGPT can utilize the peer-reviewed literature to source its responses, this study seeks to characterize the validity of its responses to common TKA and THA questions and characterize the peer-reviewed literature that it uses to formulate its responses. Methods: Preoperative TKA and THA questions were formulated by fellowship-trained adult reconstruction surgeons based on common questions posed by patients in the clinical setting. Questions were inputted into ChatGPT with the initial request of using solely the peer-reviewed literature to generate its responses. The validity of each response was rated on a Likert scale by the fellowship-trained surgeons, and the sources utilized were characterized in terms of accuracy of comparison to existing publications, publication date, study design, level of evidence, journal of publication, journal impact factor based on the clarivate analytics factor tool, journal OA status, and whether the journal is based in the United States. Results: A total of 109 sources were cited by ChatGPT in its answers to 17 questions regarding TKA procedures and 16 THA procedures. Thirty-nine sources (36%) were deemed accurate or able to be directly traced to an existing publication. Of these, seven (18%) were identified as duplicates, yielding a total of 32 unique sources that were identified as accurate and further characterized. The most common characteristics of these sources included dates of publication between 2011 and 2015 (10), publication in The Journal of Bone and Joint Surgery (13), journal impact factors between 5.1 and 10.0 (17), internationally based journals (17), and journals that are not OA (28). The most common study designs were retrospective cohort studies and case series (seven each). The level of evidence was broadly distributed between Levels I, III, and IV (seven each). The averages for the Likert scales for medical accuracy and completeness were 4.4/6 and 1.92/3, respectively. Conclusions: Investigation into ChatGPT's response quality and use of peer-reviewed sources when prompted with archetypal pre-TKA and pre-THA questions found ChatGPT to provide mostly reliable responses based on fellowship-trained orthopedic surgeon review of 4.4/6 for accuracy and 1.92/3 for completeness despite a 64.22% rate of citing inaccurate references. This study suggests that until ChatGPT is proven to be a reliable source of valid information and references, patients must exercise extreme caution in directing their pre-TKA and THA questions to this medium.
Affiliation(s)
- M. Kareem Shaath: Orlando Health Jewett Orthopedic Institute, Orlando, Florida, USA
- Matthew S. Kerr: Orthopaedic Surgery Department, Cleveland Clinic Florida, Weston, Florida, USA
- Cody C. Green: Orlando Health Jewett Orthopedic Institute, Orlando, Florida, USA
|
18
|
Porto JR, Morgan KA, Hecht CJ, Burkhart RJ, Liu RW. Quantifying the Scope of Artificial Intelligence-Assisted Writing in Orthopaedic Medical Literature: An Analysis of Prevalence and Validation of AI-Detection Software. J Am Acad Orthop Surg 2025; 33:42-50. [PMID: 39602700 DOI: 10.5435/jaaos-d-24-00084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/21/2024] [Accepted: 08/12/2024] [Indexed: 11/29/2024] Open
Abstract
INTRODUCTION The popularization of generative artificial intelligence (AI), including Chat Generative Pre-trained Transformer (ChatGPT), has raised concerns for the integrity of academic literature. This study asked the following questions: (1) Has the popularization of publicly available generative AI, such as ChatGPT, increased the prevalence of AI-generated orthopaedic literature? (2) Can AI detectors accurately identify ChatGPT-generated text? (3) Are there associations between article characteristics and the likelihood that an article was AI generated? METHODS PubMed was searched across six major orthopaedic journals to identify articles received for publication after January 1, 2023. Two hundred and forty articles were randomly selected and entered into three popular AI detectors. Twenty articles published by each journal before the release of ChatGPT were randomly selected as negative control articles. Thirty-six positive control articles (6 per journal) were created by altering 25%, 50%, and 100% of text from negative control articles using ChatGPT and were then used to validate each detector. The mean percentage of text detected as written by AI per detector was compared between pre-ChatGPT and post-ChatGPT release articles using independent t-test. Multivariate regression analysis was conducted using percentage AI-generated text per journal, article type (ie, cohort, clinical trial, review), and month of submission. RESULTS One AI detector consistently and accurately identified AI-generated text in positive control articles, whereas two others showed poor sensitivity and specificity. The most accurate detector showed a modest increase in the percentage AI detected for the articles received post release of ChatGPT (+1.8%, P = 0.01). Regression analysis showed no consistent associations between likelihood of AI-generated text per journal, article type, or month of submission. CONCLUSIONS As this study found an early, albeit modest, effect of generative AI on the orthopaedic literature, proper oversight will play a critical role in maintaining research integrity and accuracy. AI detectors may play a critical role in regulatory efforts, although they will require further development and standardization of the interpretation of their results.
Affiliation(s)
- Joshua R Porto: Department of Orthopaedic Surgery, University Hospitals of Cleveland, Case Western Reserve University, Cleveland, OH (Porto, Morgan, Hecht, Burkhart, and Liu), and the Case Western Reserve University School of Medicine, Cleveland, OH (Porto, Morgan, and Hecht)
|
19
|
Patel SJ, Notarianni AP, Martin AK, Tsai A, Pulton DA, Linganna RE, Bhatte S, Montealegre-Gallegos M, Patel B, Waldron NH, Nimma SR, Kothari P, Kiwakyou L, Baskin SM, Feinman JW. The Year in Graduate Medical Education: Selected Highlights from 2023. J Cardiothorac Vasc Anesth 2024; 38:2906-2914. [PMID: 39261208 DOI: 10.1053/j.jvca.2024.05.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 04/23/2024] [Revised: 05/01/2024] [Accepted: 05/02/2024] [Indexed: 09/13/2024]
Abstract
This special article is the third in an annual series of the Journal of Cardiothoracic and Vascular Anesthesia that highlights significant literature from the world of graduate medical education published over the past year. Major themes addressed in this review include the potential uses and pitfalls of artificial intelligence in graduate medical education, trainee well-being and the rise of unionized house staff, the effect of gender and race/ethnicity on residency application and attrition rates, and the adoption of novel technologies in medical simulation and education. The authors thank the editorial board for again allowing us to draw attention to some of the more interesting work published in the field of graduate medical education during 2023. We hope that the readers find these highlights thought-provoking and informative as we all strive to successfully educate the next generation of anesthesiologists.
Affiliation(s)
- Saumil J Patel: Department of Anesthesiology and Critical Care, Perelman School of Medicine, Philadelphia, PA
- Andrew P Notarianni: Cardiothoracic Division, Department of Anesthesiology, Yale University School of Medicine, New Haven, CT
- Archer Kilbourne Martin: Division of Cardiovascular and Thoracic Anesthesiology, Department of Anesthesiology and Perioperative Medicine, Mayo Clinic, Jacksonville, FL
- Albert Tsai: Division of Cardiothoracic Anesthesiology, Department of Anesthesiology, Perioperative, and Pain Medicine, Stanford University School of Medicine, Stanford, CA
- Danielle A Pulton: Department of Anesthesiology, Temple University Hospital/Lewis Katz School of Medicine, Philadelphia, PA
- Regina E Linganna: Department of Anesthesiology and Critical Care, Perelman School of Medicine, Philadelphia, PA
- Sai Bhatte: Perelman School of Medicine, Philadelphia, PA
- Mario Montealegre-Gallegos: Cardiothoracic Division, Department of Anesthesiology, Yale University School of Medicine, New Haven, CT
- Bhoumesh Patel: Cardiothoracic Division, Department of Anesthesiology, Yale University School of Medicine, New Haven, CT
- Nathan H Waldron: Division of Cardiovascular and Thoracic Anesthesiology, Department of Anesthesiology and Perioperative Medicine, Mayo Clinic, Jacksonville, FL
- Sindhuja R Nimma: Division of Cardiovascular and Thoracic Anesthesiology, Department of Anesthesiology and Perioperative Medicine, Mayo Clinic, Jacksonville, FL
- Perin Kothari: Division of Cardiothoracic Anesthesiology, Department of Anesthesiology, Perioperative, and Pain Medicine, Stanford University School of Medicine, Stanford, CA
- Larissa Kiwakyou: Division of Cardiothoracic Anesthesiology, Department of Anesthesiology, Perioperative, and Pain Medicine, Stanford University School of Medicine, Stanford, CA
- Sean M Baskin: Department of Anesthesiology, Temple University Hospital/Lewis Katz School of Medicine, Philadelphia, PA
- Jared W Feinman: Department of Anesthesiology and Critical Care, Perelman School of Medicine, Philadelphia, PA
|
20
|
Wang J, Liao Y, Liu S, Zhang D, Wang N, Shu J, Wang R. The impact of using ChatGPT on academic writing among medical undergraduates. Ann Med 2024; 56:2426760. [PMID: 39555617 PMCID: PMC11574940 DOI: 10.1080/07853890.2024.2426760] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/28/2024] [Revised: 08/12/2024] [Accepted: 10/08/2024] [Indexed: 11/19/2024] Open
Abstract
BACKGROUND ChatGPT is widely used for writing tasks, yet its effects on medical students' academic writing remain underexplored. This study aims to elucidate ChatGPT's impact on academic writing efficiency and quality among medical students, while also evaluating students' attitudes towards its use in academic writing. METHODS We collected systematic reviews from 130 third-year medical students and administered a questionnaire to assess ChatGPT usage and student attitudes. Three independent reviewers graded the papers using EASE guidelines, and statistical analysis compared articles generated with or without ChatGPT assistance across various parameters, with rigorous quality control ensuring survey reliability and validity. RESULTS In this study, 33 students (25.8%) utilized ChatGPT for writing (ChatGPT group) and 95 (74.2%) did not (Control group). The ChatGPT group exhibited significantly higher daily technology use and prior experience with ChatGPT (p < 0.05). Writing time was significantly reduced in the ChatGPT group (p = 0.04), with 69.7% completing tasks within 2-3 days compared to 48.4% in the control group. They also achieved higher article quality scores (p < 0.0001) with improvements in completeness, credibility, and scientific content. Self-assessment indicated enhanced writing skills (p < 0.01), confidence (p < 0.001), satisfaction (p < 0.001) and a positive attitude toward its future use in the ChatGPT group. CONCLUSIONS Integrating ChatGPT in medical academic writing, with proper guidance, improves efficiency and quality, illustrating artificial intelligence's potential in shaping medical education methodologies.
Affiliation(s)
- Jingyu Wang: Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha, Hunan Province, China; Xiangya School of Medicine, Central South University, Changsha, Hunan Province, China
- Yuxuan Liao: National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences Peking Union Medical College, Beijing, China; Graduate School of Peking Union Medical College, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Shaojun Liu: Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha, Hunan Province, China; Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Changsha, Hunan Province, China
- Decai Zhang: Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha, Hunan Province, China; Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Changsha, Hunan Province, China
- Na Wang: The First People's Hospital of Foshan, Foshan, China
- Jiankun Shu: The First People's Hospital of Foshan, Foshan, China; The First School of Clinical Medicine, Southern Medical University, Guangzhou, China
- Rui Wang: Department of Gastroenterology, The Third Xiangya Hospital of Central South University, Changsha, Hunan Province, China; Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Changsha, Hunan Province, China
|
21
|
Whitrock JN, Pratt CG, Carter MM, Chae RC, Price AD, Justiniano CF, Van Haren RM, Silski LS, Quillin RC, Shah SA. Does using artificial intelligence take the person out of personal statements? We can't tell. Surgery 2024; 176:1610-1616. [PMID: 39299851 DOI: 10.1016/j.surg.2024.08.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/24/2024] [Revised: 06/11/2024] [Accepted: 08/12/2024] [Indexed: 09/22/2024]
Abstract
BACKGROUND Use of artificial intelligence to generate personal statements for residency is currently not permitted but is difficult to monitor. This study sought to evaluate the ability of surgical residency application reviewers to identify artificial intelligence-generated personal statements and to understand perceptions of this practice. METHODS Three personal statements were generated using ChatGPT, and 3 were written by medical students who previously matched into surgery residency. Blinded participants at a single institution were instructed to read all personal statements and identify which were generated by artificial intelligence; they then completed a survey exploring their opinions regarding artificial intelligence use. RESULTS Of the 30 participants, 50% were faculty (n = 15) and 50% were residents (n = 15). Overall, experience ranged from 0 to 20 years (median, 2 years; interquartile range, 1-6.25 years). Artificial intelligence-derived personal statements were identified correctly only 59% of the time, with 3 (10%) participants identifying all the artificial intelligence-derived personal statements correctly. Artificial intelligence-generated personal statements were labeled as the best 60% of the time and the worst 43.3% of the time. When asked whether artificial intelligence use should be allowed in personal statements writing, 66.7% (n = 20) said no and 30% (n = 9) said yes. When asked if the use of artificial intelligence would impact their opinion of an applicant, 80% (n = 24) said yes, and 20% (n = 6) said no. When survey questions and ability to identify artificial intelligence-generated personal statements were evaluated by faculty/resident status and experience, no differences were noted (P > .05). CONCLUSION This study shows that surgical faculty and residents cannot reliably identify artificial intelligence-generated personal statements and that concerns exist regarding the impact of artificial intelligence on the application process.
Affiliation(s)
- Jenna N Whitrock: Cincinnati Research in Outcomes and Safety in Surgery (CROSS) Research Group, Department of Surgery, University of Cincinnati College of Medicine, Cincinnati, OH
- Catherine G Pratt: Cincinnati Research in Outcomes and Safety in Surgery (CROSS) Research Group, Department of Surgery, University of Cincinnati College of Medicine, Cincinnati, OH
- Michela M Carter: Cincinnati Research in Outcomes and Safety in Surgery (CROSS) Research Group, Department of Surgery, University of Cincinnati College of Medicine, Cincinnati, OH
- Ryan C Chae: Cincinnati Research in Outcomes and Safety in Surgery (CROSS) Research Group, Department of Surgery, University of Cincinnati College of Medicine, Cincinnati, OH
- Adam D Price: Cincinnati Research in Outcomes and Safety in Surgery (CROSS) Research Group, Department of Surgery, University of Cincinnati College of Medicine, Cincinnati, OH
- Carla F Justiniano: Cincinnati Research in Outcomes and Safety in Surgery (CROSS) Research Group, Department of Surgery, University of Cincinnati College of Medicine, Cincinnati, OH; Division of Colon and Rectal Surgery, Department of Surgery, University of Cincinnati College of Medicine, Cincinnati, OH
- Robert M Van Haren: Cincinnati Research in Outcomes and Safety in Surgery (CROSS) Research Group, Department of Surgery, University of Cincinnati College of Medicine, Cincinnati, OH; Division of Cardiothoracic Surgery, Department of Surgery, University of Cincinnati College of Medicine, Cincinnati, OH
- Latifa S Silski: Cincinnati Research in Outcomes and Safety in Surgery (CROSS) Research Group, Department of Surgery, University of Cincinnati College of Medicine, Cincinnati, OH; Division of Transplantation, Department of Surgery, University of Cincinnati College of Medicine, Cincinnati, OH
- Ralph C Quillin: Cincinnati Research in Outcomes and Safety in Surgery (CROSS) Research Group, Department of Surgery, University of Cincinnati College of Medicine, Cincinnati, OH; Division of Transplantation, Department of Surgery, University of Cincinnati College of Medicine, Cincinnati, OH
- Shimul A Shah: Cincinnati Research in Outcomes and Safety in Surgery (CROSS) Research Group, Department of Surgery, University of Cincinnati College of Medicine, Cincinnati, OH; Division of Transplantation, Department of Surgery, University of Cincinnati College of Medicine, Cincinnati, OH
|
22
|
Kolding S, Lundin RM, Hansen L, Østergaard SD. Use of generative artificial intelligence (AI) in psychiatry and mental health care: a systematic review. Acta Neuropsychiatr 2024; 37:e37. [PMID: 39523628 DOI: 10.1017/neu.2024.50] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/16/2024]
Abstract
OBJECTIVES Tools based on generative artificial intelligence (AI) such as ChatGPT have the potential to transform modern society, including the field of medicine. Due to the prominent role of language in psychiatry, e.g., for diagnostic assessment and psychotherapy, these tools may be particularly useful within this medical field. Therefore, the aim of this study was to systematically review the literature on generative AI applications in psychiatry and mental health. METHODS We conducted a systematic review following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. The search was conducted across three databases, and the resulting articles were screened independently by two researchers. The content, themes, and findings of the articles were qualitatively assessed. RESULTS The search and screening process resulted in the inclusion of 40 studies. The median year of publication was 2023. The themes covered in the articles were mainly mental health and well-being in general - with less emphasis on specific mental disorders (substance use disorder being the most prevalent). The majority of studies were conducted as prompt experiments, with the remaining studies comprising surveys, pilot studies, and case reports. Most studies focused on models that generate language, ChatGPT in particular. CONCLUSIONS Generative AI in psychiatry and mental health is a nascent but quickly expanding field. The literature mainly focuses on applications of ChatGPT, and finds that generative AI performs well, but notes that it is limited by significant safety and ethical concerns. Future research should strive to enhance transparency of methods, use experimental designs, ensure clinical relevance, and involve users/patients in the design phase.
Affiliation(s)
- Sara Kolding: Department of Clinical Medicine, Aarhus University, Aarhus, Denmark; Department of Affective Disorders, Aarhus University Hospital - Psychiatry, Aarhus, Denmark; Center for Humanities Computing, Aarhus University, Aarhus, Denmark
- Robert M Lundin: Deakin University, Institute for Mental and Physical Health and Clinical Translation (IMPACT), Geelong, VIC, Australia; Mildura Base Public Hospital, Mental Health Services, Alcohol and Other Drugs Integrated Treatment Team, Mildura, VIC, Australia; Barwon Health, Change to Improve Mental Health (CHIME), Mental Health Drugs and Alcohol Services, Geelong, VIC, Australia
- Lasse Hansen: Department of Clinical Medicine, Aarhus University, Aarhus, Denmark; Department of Affective Disorders, Aarhus University Hospital - Psychiatry, Aarhus, Denmark; Center for Humanities Computing, Aarhus University, Aarhus, Denmark
- Søren Dinesen Østergaard: Department of Clinical Medicine, Aarhus University, Aarhus, Denmark; Department of Affective Disorders, Aarhus University Hospital - Psychiatry, Aarhus, Denmark
|
23
|
Chen A, Qilleri A, Foster T, Rao AS, Gopalakrishnan S, Niezgoda J, Oropallo A. Generative Artificial Intelligence: Applications in Scientific Writing and Data Analysis in Wound Healing Research. Adv Skin Wound Care 2024; 37:601-607. [PMID: 39792511 DOI: 10.1097/asw.0000000000000226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/12/2025]
Abstract
ABSTRACT Generative artificial intelligence (AI) models are a new technological development with vast research use cases among medical subspecialties. These powerful large language models offer a wide range of possibilities in wound care, from personalized patient support to optimized treatment plans and improved scientific writing. They can also assist in efficiently navigating the literature and selecting and summarizing articles, enabling researchers to focus on impactful studies relevant to wound care management and enhancing response quality through prompt-learning iterations. For nonnative English-speaking medical practitioners and authors, generative AI may aid in grammar and vocabulary selection. Although reports have suggested limitations of the conversational agent in medical translation, particularly in the precise interpretation of medical context, when used with verified resources this language model can break down language barriers and promote practice-changing advancements in global wound care. Further, AI-powered chatbots can enable continuous monitoring of wound healing progress and real-time insights into treatment responses through frequent, readily available remote patient follow-ups. However, implementing AI in wound care research requires careful consideration of potential limitations, especially in accurately translating complex medical terms and workflows. Ethical considerations are vital to ensure reliable and credible wound care research when using AI technologies. Although ChatGPT shows promise for transforming wound care management, the authors warn against overreliance on the technology. Considering the potential limitations and risks, proper validation and oversight are essential to unlock its true potential while ensuring patient safety and the effectiveness of wound care treatments.
Affiliation(s)
- Adrian Chen: At the Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, New York, United States, Adrian Chen, BS, Aleksandra Qilleri, BS, and Timothy Foster, BS, are Medical Students. Amit S. Rao, MD, is Project Manager, Department of Surgery, Wound Care Division, Northwell Wound Healing Center and Hyperbarics, Northwell Health, Hempstead. Sandeep Gopalakrishnan, PhD, MAPWCA, is Associate Professor and Director, Wound Healing and Tissue Repair Analytics Laboratory, School of Nursing, College of Health Professions, University of Wisconsin-Milwaukee. Jeffrey Niezgoda, MD, MAPWCA, is Founder and President Emeritus, AZH Wound Care and Hyperbaric Oxygen Therapy Center, Milwaukee, and President and Chief Medical Officer, WebCME, Greendale, Wisconsin. Alisha Oropallo, MD, is Professor of Surgery, Donald and Barbara Zucker School of Medicine and The Feinstein Institutes for Medical Research, Manhasset New York; Director, Comprehensive Wound Healing Center, Northwell Health; and Program Director, Wound and Burn Fellowship program, Northwell Health
|
24
|
Hirata K, Matsui Y, Yamada A, Fujioka T, Yanagawa M, Nakaura T, Ito R, Ueda D, Fujita S, Tatsugami F, Fushimi Y, Tsuboyama T, Kamagata K, Nozaki T, Fujima N, Kawamura M, Naganawa S. Generative AI and large language models in nuclear medicine: current status and future prospects. Ann Nucl Med 2024; 38:853-864. [PMID: 39320419 DOI: 10.1007/s12149-024-01981-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/02/2024] [Accepted: 09/13/2024] [Indexed: 09/26/2024]
Abstract
This review explores the potential applications of Large Language Models (LLMs) in nuclear medicine, especially nuclear medicine examinations such as PET and SPECT, reviewing recent advancements in both fields. Despite the rapid adoption of LLMs in various medical specialties, their integration into nuclear medicine has not yet been sufficiently explored. We first discuss the latest developments in nuclear medicine, including new radiopharmaceuticals, imaging techniques, and clinical applications. We then analyze how LLMs are being utilized in radiology, particularly in report generation, image interpretation, and medical education. We highlight the potential of LLMs to enhance nuclear medicine practices, such as improving report structuring, assisting in diagnosis, and facilitating research. However, challenges remain, including the need for improved reliability, explainability, and bias reduction in LLMs. The review also addresses the ethical considerations and potential limitations of AI in healthcare. In conclusion, LLMs have significant potential to transform existing frameworks in nuclear medicine, making it a critical area for future research and development.
Affiliation(s)
- Kenji Hirata: Department of Diagnostic Imaging, Graduate School of Medicine, Hokkaido University, Kita 15, Nishi 7, Kita-Ku, Sapporo, Hokkaido, 060-8638, Japan
- Yusuke Matsui: Department of Radiology, Faculty of Medicine, Dentistry and Pharmaceutical Sciences, Okayama University, Kita-Ku, Okayama, Japan
- Akira Yamada: Medical Data Science Course, Shinshu University School of Medicine, Matsumoto, Nagano, Japan
- Tomoyuki Fujioka: Department of Diagnostic Radiology, Tokyo Medical and Dental University, Bunkyo-Ku, Tokyo, Japan
- Masahiro Yanagawa: Department of Radiology, Osaka University Graduate School of Medicine, Suita-City, Osaka, Japan
- Takeshi Nakaura: Department of Diagnostic Radiology, Kumamoto University Graduate School of Medicine, Chuo-Ku, Kumamoto, Japan
- Rintaro Ito: Department of Radiology, Nagoya University Graduate School of Medicine, Showa-Ku, Nagoya, Japan
- Daiju Ueda: Department of Artificial Intelligence, Graduate School of Medicine, Osaka Metropolitan University, Abeno-Ku, Osaka, Japan
- Shohei Fujita: Department of Radiology, Graduate School of Medicine and Faculty of Medicine, The University of Tokyo, Bunkyo-Ku, Tokyo, Japan
- Fuminari Tatsugami: Department of Diagnostic Radiology, Hiroshima University, Minami-Ku, Hiroshima, Japan
- Yasutaka Fushimi: Department of Diagnostic Imaging and Nuclear Medicine, Kyoto University Graduate School of Medicine, Sakyoku, Kyoto, Japan
- Takahiro Tsuboyama: Department of Radiology, Kobe University Graduate School of Medicine, Chuo-Ku, Kobe, Japan
- Koji Kamagata: Department of Radiology, Juntendo University Graduate School of Medicine, Bunkyo-Ku, Tokyo, Japan
- Taiki Nozaki: Department of Radiology, Keio University School of Medicine, Shinjuku-Ku, Tokyo, Japan
- Noriyuki Fujima: Department of Diagnostic and Interventional Radiology, Hokkaido University Hospital, Kita-Ku, Sapporo, Japan
- Mariko Kawamura: Department of Radiology, Nagoya University Graduate School of Medicine, Showa-Ku, Nagoya, Japan
- Shinji Naganawa: Department of Radiology, Nagoya University Graduate School of Medicine, Showa-Ku, Nagoya, Japan
|
25
|
Warn M, Meller LLT, Chan D, Torabi SJ, Bitner BF, Tajudeen BA, Kuan EC. Assessing the Readability, Reliability, and Quality of AI-Modified and Generated Patient Education Materials for Endoscopic Skull Base Surgery. Am J Rhinol Allergy 2024; 38:396-402. [PMID: 39169720 DOI: 10.1177/19458924241273055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 08/23/2024]
Abstract
BACKGROUND Despite National Institutes of Health and American Medical Association recommendations to publish online patient education materials at or below sixth-grade literacy, those pertaining to endoscopic skull base surgery (ESBS) have lacked readability and quality. ChatGPT is an artificial intelligence (AI) system capable of synthesizing vast internet data to generate responses to user queries, but its utility in improving patient education materials has not been explored. OBJECTIVE To examine the current state of readability and quality of online patient education materials and to determine the utility of ChatGPT for improving articles and generating patient education materials. METHODS An article search was performed utilizing 10 different search terms related to ESBS. The ten least readable existing patient-facing articles were modified with ChatGPT, and iterative queries were used to generate an article de novo. The Flesch Reading Ease (FRE) and related metrics measured overall readability and content literacy level, while DISCERN assessed article reliability and quality. RESULTS Sixty-six articles were located. ChatGPT improved FRE readability of the 10 least readable online articles (19.7 ± 4.4 vs. 56.9 ± 5.9, p < 0.001), from university to 10th grade level. The generated article was more readable than 48.5% of articles (38.9 vs. 39.4 ± 12.4) and higher quality than 94% (51.0 vs. 37.6 ± 6.1). Overall, 56.7% of the online articles had "poor" quality. CONCLUSIONS ChatGPT improves the readability of articles, though most still remain above the recommended literacy level for patient education materials. With iterative queries, ChatGPT can generate more reliable and higher quality patient education materials compared to most existing online articles and can be tailored to match the readability of average online articles.
Affiliation(s)
- Michael Warn: Riverside School of Medicine, University of California, Riverside, California
- Leo L T Meller: San Diego School of Medicine, University of California, San Diego, California
- Daniella Chan: Department of Otolaryngology - Head and Neck Surgery, University of California, Irvine, Orange, California
- Sina J Torabi: Department of Otolaryngology - Head and Neck Surgery, University of California, Irvine, Orange, California
- Benjamin F Bitner: Department of Otolaryngology - Head and Neck Surgery, University of California, Irvine, Orange, California
- Bobby A Tajudeen: Department of Otolaryngology - Head and Neck Surgery, Rush University, Chicago, Illinois
- Edward C Kuan: Department of Otolaryngology - Head and Neck Surgery, University of California, Irvine, Orange, California
|
26
|
Leon M, Ruaengsri C, Pelletier G, Bethencourt D, Shibata M, Flores MQ, Shudo Y. Harnessing the Power of ChatGPT in Cardiovascular Medicine: Innovations, Challenges, and Future Directions. J Clin Med 2024; 13:6543. [PMID: 39518681 PMCID: PMC11546989 DOI: 10.3390/jcm13216543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2024] [Revised: 10/08/2024] [Accepted: 10/29/2024] [Indexed: 11/16/2024] Open
Abstract
Cardiovascular diseases remain the leading cause of morbidity and mortality globally, posing significant challenges to public health. The rapid evolution of artificial intelligence (AI), particularly with large language models such as ChatGPT, has introduced transformative possibilities in cardiovascular medicine. This review examines ChatGPT's broad applications in enhancing clinical decision-making-covering symptom analysis, risk assessment, and differential diagnosis; advancing medical education for both healthcare professionals and patients; and supporting research and academic communication. Key challenges associated with ChatGPT, including potential inaccuracies, ethical considerations, data privacy concerns, and inherent biases, are discussed. Future directions emphasize improving training data quality, developing specialized models, refining AI technology, and establishing regulatory frameworks to enhance ChatGPT's clinical utility and mitigate associated risks. As cardiovascular medicine embraces AI, ChatGPT stands out as a powerful tool with substantial potential to improve therapeutic outcomes, elevate care quality, and advance research innovation. Fully understanding and harnessing this potential is essential for the future of cardiovascular health.
Affiliation(s)
- Yasuhiro Shudo: Department of Cardiothoracic Surgery, Stanford University School of Medicine, 300 Pasteur Drive, Falk CVRB, Stanford, CA 94305, USA; (C.R.); (G.P.); (D.B.); (M.Q.F.)
|
27
|
Nógrádi B, Polgár TF, Meszlényi V, Kádár Z, Hertelendy P, Csáti A, Szpisjak L, Halmi D, Erdélyi-Furka B, Tóth M, Molnár F, Tóth D, Bősze Z, Boda K, Klivényi P, Siklós L, Patai R. ChatGPT M.D.: Is there any room for generative AI in neurology? PLoS One 2024; 19:e0310028. [PMID: 39383119 PMCID: PMC11463752 DOI: 10.1371/journal.pone.0310028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/21/2023] [Accepted: 08/22/2024] [Indexed: 10/11/2024] Open
Abstract
ChatGPT, a general artificial intelligence, has been recognized as a powerful tool in scientific writing and programming, but its use as a medical tool is largely overlooked. The general accessibility, rapid response time and comprehensive training database might enable ChatGPT to serve as a diagnostic augmentation tool in certain clinical settings. The diagnostic process in neurology is often challenging and complex. In certain time-sensitive scenarios, rapid evaluation and diagnostic decisions are needed, while in other cases clinicians are faced with rare disorders and atypical disease manifestations. Due to these factors, the diagnostic accuracy in neurology is often suboptimal. Here we evaluated whether ChatGPT can be utilized as a valuable and innovative diagnostic augmentation tool in various neurological settings. We used synthetic data generated by neurological experts to represent descriptive anamneses of patients with known neurology-related diseases; the probability of an appropriate diagnosis made by ChatGPT was then measured. To give clarity to the accuracy of the AI-determined diagnosis, all cases have been cross-validated by other experts and general medical doctors as well. We found that ChatGPT-determined diagnostic accuracy (ranging from 68.5% ± 3.28% to 83.83% ± 2.73%) can reach the accuracy of other experts (81.66% ± 2.02%); furthermore, it surpasses the probability of an appropriate diagnosis if the examiner is a general medical doctor (57.15% ± 2.64%). Our results showcase the efficacy of general artificial intelligence like ChatGPT as a diagnostic augmentation tool in medicine. In the future, AI-based supporting tools might be useful adjuncts in medical practice and help to improve the diagnostic process in neurology.
Affiliation(s)
- Bernát Nógrádi: Institute of Biophysics, HUN-REN Biological Research Centre, Szeged, Hungary; Department of Neurology, Albert Szent-Györgyi Health Centre, University of Szeged, Szeged, Hungary
- Tamás Ferenc Polgár: Institute of Biophysics, HUN-REN Biological Research Centre, Szeged, Hungary; Theoretical Medicine Doctoral School, University of Szeged, Szeged, Hungary
- Valéria Meszlényi: Institute of Biophysics, HUN-REN Biological Research Centre, Szeged, Hungary; Department of Neurology, Albert Szent-Györgyi Health Centre, University of Szeged, Szeged, Hungary
- Zalán Kádár: Institute of Biophysics, HUN-REN Biological Research Centre, Szeged, Hungary
- Péter Hertelendy: Department of Neurology, Albert Szent-Györgyi Health Centre, University of Szeged, Szeged, Hungary
- Anett Csáti: Department of Neurology, Albert Szent-Györgyi Health Centre, University of Szeged, Szeged, Hungary
- László Szpisjak: Department of Neurology, Albert Szent-Györgyi Health Centre, University of Szeged, Szeged, Hungary
- Dóra Halmi: Metabolic Diseases and Cell Signaling Research Group, Department of Biochemistry, Albert Szent-Györgyi Medical School, University of Szeged, Szeged, Hungary; Interdisciplinary Medicine Doctoral School, University of Szeged, Szeged, Hungary
- Barbara Erdélyi-Furka: Metabolic Diseases and Cell Signaling Research Group, Department of Biochemistry, Albert Szent-Györgyi Medical School, University of Szeged, Szeged, Hungary; Interdisciplinary Medicine Doctoral School, University of Szeged, Szeged, Hungary
- Máté Tóth: Second Department of Internal Medicine and Cardiology Centre, Albert Szent-Györgyi Health Centre, University of Szeged, Szeged, Hungary
- Fanny Molnár: Department of Family Medicine, Albert Szent-Györgyi Health Centre, University of Szeged, Szeged, Hungary
- Dávid Tóth: Department of Oncotherapy, Albert Szent-Györgyi Health Centre, University of Szeged, Szeged, Hungary
- Zsófia Bősze: Department of Internal Medicine, Albert Szent-Györgyi Health Centre, University of Szeged, Szeged, Hungary
- Krisztina Boda: Department of Medical Physics and Informatics, University of Szeged, Szeged, Hungary
- Péter Klivényi: Department of Neurology, Albert Szent-Györgyi Health Centre, University of Szeged, Szeged, Hungary
- László Siklós: Institute of Biophysics, HUN-REN Biological Research Centre, Szeged, Hungary
- Roland Patai: Institute of Biophysics, HUN-REN Biological Research Centre, Szeged, Hungary
|
28
|
Naja F, Taktouk M, Matbouli D, Khaleel S, Maher A, Uzun B, Alameddine M, Nasreddine L. Artificial intelligence chatbots for the nutrition management of diabetes and the metabolic syndrome. Eur J Clin Nutr 2024; 78:887-896. [PMID: 39060542 DOI: 10.1038/s41430-024-01476-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2024] [Revised: 07/16/2024] [Accepted: 07/17/2024] [Indexed: 07/28/2024]
Abstract
BACKGROUND Recently, there has been a growing interest in exploring AI-driven chatbots, such as ChatGPT, as a resource for disease management and education. OBJECTIVE The study aims to evaluate ChatGPT's accuracy and quality/clarity in providing nutritional management for Type 2 Diabetes (T2DM), the Metabolic syndrome (MetS) and its components, in accordance with the Academy of Nutrition and Dietetics' guidelines. METHODS Three nutrition management-related domains were considered: (1) Dietary management, (2) Nutrition care process (NCP) and (3) Menu planning (1500 kcal). A total of 63 prompts were used. Two experienced dietitians evaluated the chatbot output's concordance with the guidelines. RESULTS Both dietitians provided similar assessments for most conditions examined in the study. Gaps in the ChatGPT-derived outputs were identified and included weight loss recommendations, energy deficit, anthropometric assessment, specific nutrients of concern and the adoption of specific dietary interventions. Gaps in physical activity recommendations were also observed, highlighting ChatGPT's limitations in providing holistic lifestyle interventions. Within the NCP, the generated output provided incomplete examples of diagnostic documentation statements and had significant gaps in the monitoring and evaluation step. In the 1500 kcal one-day menus, the amounts of carbohydrates, fat, vitamin D and calcium were discordant with dietary recommendations. Regarding clarity, dietitians rated the output as either good or excellent. CONCLUSION Although ChatGPT is an increasingly available resource for practitioners, users are encouraged to consider the gaps identified in this study in the dietary management of T2DM and the MetS.
Affiliation(s)
- Farah Naja: Department of Clinical Nutrition and Dietetics, College of Health Sciences, Research Institute of Medical and Health Sciences (RIMHS), University of Sharjah, Sharjah, United Arab Emirates; Department of Nutrition and Food Sciences, Faculty of Agricultural and Food Sciences, American University of Beirut (AUB), Beirut, Lebanon
- Mandy Taktouk: Department of Nutrition and Food Sciences, Faculty of Agricultural and Food Sciences, American University of Beirut (AUB), Beirut, Lebanon
- Dana Matbouli: Department of Nutrition and Food Sciences, Faculty of Agricultural and Food Sciences, American University of Beirut (AUB), Beirut, Lebanon
- Sharfa Khaleel: Department of Clinical Nutrition and Dietetics, College of Health Sciences, Research Institute of Medical and Health Sciences (RIMHS), University of Sharjah, Sharjah, United Arab Emirates
- Ayah Maher: Department of Clinical Nutrition and Dietetics, College of Health Sciences, Research Institute of Medical and Health Sciences (RIMHS), University of Sharjah, Sharjah, United Arab Emirates
- Berna Uzun: Department of Mathematics, Near East University, Nicosia, Turkey
- Lara Nasreddine: Department of Nutrition and Food Sciences, Faculty of Agricultural and Food Sciences, American University of Beirut (AUB), Beirut, Lebanon
|
29
|
Albuck AL, Becnel CM, Sirna DJ, Turner J. Precision of Chatbot Generative Pretrained Transformer Version 4-Generated References for Colon and Rectal Surgical Literature. J Surg Res 2024; 302:324-328. [PMID: 39121800 DOI: 10.1016/j.jss.2024.07.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/22/2024] [Revised: 03/12/2024] [Accepted: 07/07/2024] [Indexed: 08/12/2024]
Abstract
INTRODUCTION The objective is to assess the precision of references generated by Chatbot Generative Pretrained Transformer version 4 (ChatGPT-4) in scientific literature pertaining to colon and rectal surgery. METHODS Ten frequently studied keywords pertaining to colon and rectal surgery were chosen: colon cancer, rectal cancer, anal cancer, total neoadjuvant therapy, diverticulitis, low anterior resection, transanal minimally invasive surgery, ileal pouch anal anastomosis, abdominoperineal resection, and hemorrhoidectomy. ChatGPT-4 was prompted to search for the most representative citations for all keywords. After this, two separate evaluators meticulously examined each key element of the output, awarding full accuracy to generated citations in which there were no discrepancies in any of the fields when cross-referenced with the Scopus, Google, and PubMed databases. References from ChatGPT-4 underwent a thorough review process, which involved careful examination of key elements such as the article title, authors, journal name, publication year, and Digital Object Identifier (DOI). RESULTS Forty-one of the 100 references generated by ChatGPT-4 were fully accurate; however, none included a DOI. Partial accuracy was observed in 67 of the references, which were identifiable by title and journal. Performance varied across specific keywords; for example, references for colon and rectal cancer were 100% identifiable by title and journal, but no term had 100% accuracy across all categories. Notably, none of the generated references correctly listed all authors. The study was conducted within a short timeframe during which ChatGPT-4 was rapidly evolving and updating its knowledge base. CONCLUSIONS While ChatGPT-4 offers improvements over its predecessors and shows potential for use in academic literature, its inconsistent performance across categories, lack of DOIs, and irregularities in authorship listings raise concerns about its readiness for application in the field of colon and rectal surgery research.
Affiliation(s)
- Aaron L Albuck: School of Medicine, Tulane University, New Orleans, Louisiana
- Chad M Becnel: School of Medicine, Tulane University, New Orleans, Louisiana; Ochsner Clinic Foundation, New Orleans, Louisiana
- Daniel J Sirna: School of Medicine, Tulane University, New Orleans, Louisiana
- Jacquelyn Turner: Division of Endocrine and Oncologic Surgery, Department of Surgery, Tulane University School of Medicine, New Orleans, Louisiana
|
30
|
SeyedAlinaghi S, Mirzapour P, Mehraeen E. ChatGPT in Healthcare Writing: Advantages and Limitations. Healthc Inform Res 2024; 30:416-418. [PMID: 39551928 PMCID: PMC11570662 DOI: 10.4258/hir.2024.30.4.416] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2024] [Revised: 10/17/2024] [Accepted: 10/18/2024] [Indexed: 11/19/2024] Open
Affiliation(s)
- SeyedAhmad SeyedAlinaghi: Iranian Research Center for HIV/AIDS, Iranian Institute for Reduction of High Risk Behaviors, Tehran University of Medical Sciences, Tehran, Iran; Research Development Center, Arash Women Hospital, Tehran University of Medical Sciences, Tehran, Iran
- Pegah Mirzapour: Iranian Research Center for HIV/AIDS, Iranian Institute for Reduction of High Risk Behaviors, Tehran University of Medical Sciences, Tehran, Iran
- Esmaeil Mehraeen: Department of Health Information Technology, Khalkhal University of Medical Sciences, Khalkhal, Iran
|
31
|
Filetti S, Fenza G, Gallo A. Research design and writing of scholarly articles: new artificial intelligence tools available for researchers. Endocrine 2024; 85:1104-1116. [PMID: 39085566 DOI: 10.1007/s12020-024-03977-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 05/28/2024] [Accepted: 07/22/2024] [Indexed: 08/02/2024]
|
32
|
Lareyre F, Nasr B, Poggi E, Lorenzo GD, Ballaith A, Sliti I, Chaudhuri A, Raffort J. Large language models and artificial intelligence chatbots in vascular surgery. Semin Vasc Surg 2024; 37:314-320. [PMID: 39277347 DOI: 10.1053/j.semvascsurg.2024.06.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2024] [Revised: 06/12/2024] [Accepted: 06/14/2024] [Indexed: 09/17/2024]
Abstract
Natural language processing is a subfield of artificial intelligence that aims to analyze human oral or written language. The development of large language models has brought innovative perspectives in medicine, including the potential use of chatbots and virtual assistants. Nevertheless, the benefits and pitfalls of such technology need to be carefully evaluated before their use in health care. The aim of this narrative review was to provide an overview of potential applications of large language models and artificial intelligence chatbots in the field of vascular surgery, including clinical practice, research, and education. In light of the results, we discuss current limits and future directions.
Affiliation(s)
- Fabien Lareyre: Department of Vascular Surgery, Hospital of Antibes Juan-les-Pins, France; Université Côte d'Azur, Centre National de la Recherche Scientifique (CNRS), UMR7370, Laboratoire de Physiomédecine Moléculaire (LP2M), Nice, France; Fédération Hospitalo-Universitaire FHU Plan & Go, Nice, France
- Bahaa Nasr: University of Brest, Institut National de la Santé et de la Recherche Médicale (INSERM), IMT-Atlantique, UMR 1011 LaTIM, Vascular and Endovascular Surgery Department, CHU Cavale Blanche, Brest, France
- Elise Poggi: Department of Vascular Surgery, Hospital of Antibes Juan-les-Pins, France
- Gilles Di Lorenzo: Department of Vascular Surgery, Hospital of Antibes Juan-les-Pins, France
- Ali Ballaith: Department of Cardiovascular Surgery, Zayed Military Hospital, Abu Dhabi, United Arab Emirates
- Imen Sliti: Department of Vascular Surgery, Hospital of Antibes Juan-les-Pins, France
- Arindam Chaudhuri: Bedfordshire - Milton Keynes Vascular Centre, Bedfordshire Hospitals, National Health Service Foundation Trust, Bedford, UK
- Juliette Raffort: Université Côte d'Azur, Centre National de la Recherche Scientifique (CNRS), UMR7370, Laboratoire de Physiomédecine Moléculaire (LP2M), Nice, France; Fédération Hospitalo-Universitaire FHU Plan & Go, Nice, France; Clinical Chemistry Laboratory, University Hospital of Nice, France; Institute 3IA Côte d'Azur, Université Côte d'Azur, France; Department of Clinical Biochemistry, Hôpital Pasteur, Pavillon J, 30, Avenue de la Voie Romaine, 06001 Nice cedex 1, France
|
33
|
Buzzaccarini G, Degliuomini RS, Etrusco A, Giannini A, D’Amato A, Gkouvi K, Berreni N, Magon N, Candiani M, Salvatore S. The role of artificial intelligence in cosmetic and functional gynecology: Stepping into the third millennium. Eur J Obstet Gynecol Reprod Biol X 2024; 23:100322. [PMID: 39035703 PMCID: PMC11254585 DOI: 10.1016/j.eurox.2024.100322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/20/2023] [Accepted: 06/15/2024] [Indexed: 07/23/2024] Open
Abstract
Cosmetic and functional gynecology have gained popularity among patients, but the scientific literature in this field, particularly regarding the cosmetic aspect, is lacking. The use of evidence-based medicine is crucial to validate diagnostic tools and treatment protocols. However, the advent of artificial intelligence (AI) offers a promising solution to address this issue. ChatGPT, a sophisticated language model, can revolutionize AI in medicine, enabling accurate diagnosis, personalized treatment plans, and expedited research analysis. Cosmetic and functional gynecology can leverage AI to develop the field and improve evidence gathering. AI can aid in precise and personalized diagnosis, implement standardized assessment tools, simulate treatment outcomes, and assess under-skin anatomy through virtual reality. AI tools can assist clinicians in diagnosing and comparing difficult cases, calculate treatment risks, and contribute to standardization by collecting global evidence and generating guidelines. The use of AI in cosmetic and functional gynecology holds significant potential to advance the field and improve patient outcomes. This novel combination of AI and gynecology represents a groundbreaking development in medicine, emphasizing the importance of appropriate and correct AI usage.
Collapse
Affiliation(s)
- Giovanni Buzzaccarini
- Obstetrics and Gynaecology Unit, IRCCS San Raffaele Scientific Institute, Vita-Salute San Raffaele University, via Olgettina 48-60, Milan, Italy
| | - Rebecca Susanna Degliuomini
- Obstetrics and Gynaecology Unit, IRCCS San Raffaele Scientific Institute, Vita-Salute San Raffaele University, via Olgettina 48-60, Milan, Italy
| | - Andrea Etrusco
- Unit of Gynecologic Oncology, ARNAS "Civico – Di Cristina – Benfratelli", Department of Health Promotion, Mother and Child Care, Internal Medicine and Medical Specialties (PROMISE), University of Palermo, 90127 Palermo, Italy
| | - Andrea Giannini
- Department of Medical and Surgical Sciences and Translational Medicine, PhD Course in “Translational Medicine and Oncology”, Sapienza University, 00185 Rome, Italy
| | - Antonio D’Amato
- 1st Unit of Obstetrics and Gynecology, Department of Interdisciplinary Medicine, University of Bari, Bari, Italy
| | | | | | - Navneet Magon
- All India Institute of Medical Sciences, Rishikesh, Uttarakhand, India
| | - Massimo Candiani
- Obstetrics and Gynaecology Unit, IRCCS San Raffaele Scientific Institute, Vita-Salute San Raffaele University, via Olgettina 48-60, Milan, Italy
| | - Stefano Salvatore
- Obstetrics and Gynaecology Unit, IRCCS San Raffaele Scientific Institute, Vita-Salute San Raffaele University, via Olgettina 48-60, Milan, Italy
| |
Collapse
|
34
|
Suleiman A, von Wedel D, Munoz-Acuna R, Redaelli S, Santarisi A, Seibold EL, Ratajczak N, Kato S, Said N, Sundar E, Goodspeed V, Schaefer MS. Assessing ChatGPT's ability to emulate human reviewers in scientific research: A descriptive and qualitative approach. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 254:108313. [PMID: 38954915 DOI: 10.1016/j.cmpb.2024.108313] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/06/2024] [Revised: 06/20/2024] [Accepted: 06/27/2024] [Indexed: 07/04/2024]
Abstract
BACKGROUND ChatGPT is an AI platform whose relevance in the peer review of scientific articles is steadily growing. Nonetheless, it has sparked debates over its potential biases and inaccuracies. This study aims to assess ChatGPT's ability to qualitatively emulate human reviewers in scientific research. METHODS We included the first submitted version of the latest twenty original research articles published by the 3rd of July 2023 in a high-profile medical journal. Each article underwent evaluation by a minimum of three human reviewers during the initial review stage. Subsequently, three researchers with medical backgrounds and expertise in manuscript revision independently and qualitatively assessed the agreement between the peer reviews generated by ChatGPT version GPT-4 and the comments provided by human reviewers for these articles. The level of agreement was categorized into complete, partial, none, or contradictory. RESULTS A total of 720 human reviewers' comments were assessed. There was good agreement between the three assessors (overall kappa >0.6). ChatGPT's comments demonstrated complete agreement in terms of quality and substance with 48 (6.7%) human reviewers' comments, partially agreed with 92 (12.8%), identifying issues necessitating further elaboration or recommending supplementary steps to address concerns, showed no agreement with 565 (78.5%), and contradicted 15 (2.1%). ChatGPT comments on methods had the lowest proportion of complete agreement (13 comments, 3.6%), while general comments on the manuscript displayed the highest proportion of complete agreement (17 comments, 22.1%). CONCLUSION ChatGPT version GPT-4 has a limited ability to emulate human reviewers within the peer review process of scientific research.
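The agreement analysis described above (three assessors categorizing each ChatGPT comment as complete, partial, none, or contradictory, with an overall kappa above 0.6) can be outlined with standard tools. The sketch below is illustrative only: the ratings are invented, and averaging pairwise Cohen's kappa is used as a simple stand-in for whatever overall kappa variant the authors computed.

```python
from itertools import combinations
from sklearn.metrics import cohen_kappa_score

# Hypothetical ratings: three assessors label each ChatGPT comment with an
# agreement category relative to the human reviewers' comments.
categories = ["complete", "partial", "none", "contradictory"]
ratings = {
    "assessor_1": ["none", "partial", "none", "complete", "none", "contradictory"],
    "assessor_2": ["none", "partial", "none", "partial",  "none", "contradictory"],
    "assessor_3": ["none", "none",    "none", "complete", "none", "contradictory"],
}

# Average pairwise Cohen's kappa as a simple summary of inter-assessor agreement.
pairs = list(combinations(ratings, 2))
kappas = [
    cohen_kappa_score(ratings[a], ratings[b], labels=categories)
    for a, b in pairs
]
for (a, b), k in zip(pairs, kappas):
    print(f"kappa({a}, {b}) = {k:.2f}")
print(f"mean pairwise kappa = {sum(kappas) / len(kappas):.2f}")
```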
Collapse
Affiliation(s)
- Aiman Suleiman
- Department of Anesthesia, Critical Care and Pain Medicine, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Department of Anesthesia, Critical Care and Pain Medicine, Albert Einstein College of Medicine, Montefiore Medical Center, Bronx, NY, USA.
| | - Dario von Wedel
- Department of Anesthesia, Critical Care and Pain Medicine, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA
| | - Ricardo Munoz-Acuna
- Department of Anesthesia, Critical Care and Pain Medicine, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA
| | - Simone Redaelli
- Department of Anesthesia, Critical Care and Pain Medicine, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA
| | - Abeer Santarisi
- Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Department of Emergency Medicine, Disaster Medicine Fellowship, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA
| | - Eva-Lotte Seibold
- Department of Anesthesia, Critical Care and Pain Medicine, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA
| | - Nikolai Ratajczak
- Department of Anesthesia, Critical Care and Pain Medicine, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA
| | - Shinichiro Kato
- Department of Anesthesia, Critical Care and Pain Medicine, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA
| | - Nader Said
- Department of Industrial Engineering, Faculty of Engineering Technologies and Sciences, Higher Colleges of Technology, DWC, Dubai, United Arab Emirates
| | - Eswar Sundar
- Department of Anesthesia, Critical Care and Pain Medicine, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA
| | - Valerie Goodspeed
- Department of Anesthesia, Critical Care and Pain Medicine, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA
| | - Maximilian S Schaefer
- Department of Anesthesia, Critical Care and Pain Medicine, Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Center for Anesthesia Research Excellence (CARE), Harvard Medical School, Beth Israel Deaconess Medical Center, Boston, MA, USA; Klinik für Anästhesiologie, Universitätsklinikum Düsseldorf, Düsseldorf, Germany
| |
Collapse
|
35
|
Pividori M, Greene CS. A publishing infrastructure for Artificial Intelligence (AI)-assisted academic authoring. J Am Med Inform Assoc 2024; 31:2103-2113. [PMID: 38879443 PMCID: PMC11339502 DOI: 10.1093/jamia/ocae139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/17/2023] [Revised: 05/23/2024] [Accepted: 05/29/2024] [Indexed: 06/25/2024] Open
Abstract
OBJECTIVE Investigate the use of advanced natural language processing models to streamline the time-consuming process of writing and revising scholarly manuscripts. MATERIALS AND METHODS For this purpose, we integrate large language models into the Manubot publishing ecosystem to suggest revisions for scholarly texts. Our AI-based revision workflow employs a prompt generator that incorporates manuscript metadata into templates, generating section-specific instructions for the language model. The model then generates revised versions of each paragraph for human authors to review. We evaluated this methodology through 5 case studies of existing manuscripts, including the revision of this manuscript. RESULTS Our results indicate that these models, despite some limitations, can grasp complex academic concepts and enhance text quality. All changes to the manuscript are tracked using a version control system, ensuring transparency in distinguishing between human- and machine-generated text. CONCLUSIONS Given the significant time researchers invest in crafting prose, incorporating large language models into the scholarly writing process can significantly improve the type of knowledge work performed by academics. Our approach also enables scholars to concentrate on critical aspects of their work, such as the novelty of their ideas, while automating tedious tasks like adhering to specific writing styles. Although the use of AI-assisted tools in scientific authoring is controversial, our approach, which focuses on revising human-written text and provides change-tracking transparency, can mitigate concerns regarding AI's role in scientific writing.
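The workflow described above centres on a prompt generator that folds manuscript metadata into section-specific templates before asking the model to revise each paragraph. The sketch below is a minimal illustration of that idea, not the authors' Manubot implementation; the template wording, metadata fields, and function names are assumptions.

```python
# Minimal sketch of a section-aware revision prompt generator (illustrative only,
# not the authors' Manubot code; templates and field names are assumed).
SECTION_TEMPLATES = {
    "abstract": "Revise this abstract for clarity and concision, keeping all results intact.",
    "introduction": "Revise this introduction, keeping citations and the stated aims unchanged.",
    "methods": "Revise this methods paragraph for precision; do not alter any numbers.",
}

def build_revision_prompt(paragraph: str, section: str, metadata: dict) -> str:
    """Combine manuscript metadata, a section-specific instruction, and the
    paragraph to be revised into a single prompt for the language model."""
    instruction = SECTION_TEMPLATES.get(section, "Revise this paragraph for clarity.")
    return (
        f"Manuscript title: {metadata['title']}\n"
        f"Keywords: {', '.join(metadata['keywords'])}\n\n"
        f"{instruction}\n\n"
        f"Paragraph:\n{paragraph}"
    )

prompt = build_revision_prompt(
    paragraph="Our results indicate that these models can grasp complex academic concepts.",
    section="abstract",
    metadata={"title": "AI-assisted academic authoring", "keywords": ["LLM", "publishing"]},
)
print(prompt)
# The model's revised paragraph would then be committed separately, so that the
# version control history distinguishes human- from machine-generated text.
```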
Collapse
Affiliation(s)
- Milton Pividori
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO 80045, United States
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, United States
| | - Casey S Greene
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO 80045, United States
- Center for Health AI, Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO 80045, United States
| |
Collapse
|
36
|
Hindelang M, Sitaru S, Zink A. Transforming Health Care Through Chatbots for Medical History-Taking and Future Directions: Comprehensive Systematic Review. JMIR Med Inform 2024; 12:e56628. [PMID: 39207827 PMCID: PMC11393511 DOI: 10.2196/56628] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/22/2024] [Revised: 05/08/2024] [Accepted: 07/11/2024] [Indexed: 09/04/2024] Open
Abstract
BACKGROUND The integration of artificial intelligence and chatbot technology in health care has attracted significant attention due to its potential to improve patient care and streamline history-taking. As artificial intelligence-driven conversational agents, chatbots offer the opportunity to revolutionize history-taking, necessitating a comprehensive examination of their impact on medical practice. OBJECTIVE This systematic review aims to assess the role, effectiveness, usability, and patient acceptance of chatbots in medical history-taking. It also examines potential challenges and future opportunities for integration into clinical practice. METHODS A systematic search included PubMed, Embase, MEDLINE (via Ovid), CENTRAL, Scopus, and Open Science and covered studies through July 2024. The inclusion and exclusion criteria for the studies reviewed were based on the PICOS (participants, interventions, comparators, outcomes, and study design) framework. The population included individuals using health care chatbots for medical history-taking. Interventions focused on chatbots designed to facilitate medical history-taking. The outcomes of interest were the feasibility, acceptance, and usability of chatbot-based medical history-taking. Studies not reporting on these outcomes were excluded. All study designs except conference papers were eligible for inclusion. Only English-language studies were considered. There were no specific restrictions on study duration. Key search terms included "chatbot*," "conversational agent*," "virtual assistant," "artificial intelligence chatbot," "medical history," and "history-taking." The quality of observational studies was classified using the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) criteria (eg, sample size, design, data collection, and follow-up). The RoB 2 (Risk of Bias) tool assessed areas and the levels of bias in randomized controlled trials (RCTs). RESULTS The review included 15 observational studies and 3 RCTs and synthesized evidence from different medical fields and populations. Chatbots systematically collect information through targeted queries and data retrieval, improving patient engagement and satisfaction. The results show that chatbots have great potential for history-taking and that the efficiency and accessibility of the health care system can be improved by 24/7 automated data collection. Bias assessments revealed that of the 15 observational studies, 5 (33%) studies were of high quality, 5 (33%) studies were of moderate quality, and 5 (33%) studies were of low quality. Of the RCTs, 2 had a low risk of bias, while 1 had a high risk. CONCLUSIONS This systematic review provides critical insights into the potential benefits and challenges of using chatbots for medical history-taking. The included studies showed that chatbots can increase patient engagement, streamline data collection, and improve health care decision-making. For effective integration into clinical practice, it is crucial to design user-friendly interfaces, ensure robust data security, and maintain empathetic patient-physician interactions. Future research should focus on refining chatbot algorithms, improving their emotional intelligence, and extending their application to different health care settings to realize their full potential in modern medicine. TRIAL REGISTRATION PROSPERO CRD42023410312; www.crd.york.ac.uk/prospero.
Collapse
Affiliation(s)
- Michael Hindelang
- Department of Dermatology and Allergy, TUM School of Medicine and Health, Technical University of Munich, Munich, Germany
- Pettenkofer School of Public Health, Munich, Germany
- Institute for Medical Information Processing, Biometry and Epidemiology (IBE), Faculty of Medicine, Ludwig-Maximilian University, LMU, Munich, Germany
| | - Sebastian Sitaru
- Department of Dermatology and Allergy, TUM School of Medicine and Health, Technical University of Munich, Munich, Germany
| | - Alexander Zink
- Department of Dermatology and Allergy, TUM School of Medicine and Health, Technical University of Munich, Munich, Germany
- Division of Dermatology and Venereology, Department of Medicine Solna, Karolinska Institute, Stockholm, Sweden
| |
Collapse
|
37
|
Dal E, Srivastava A, Chigarira B, Hage Chehade C, Matthew Thomas V, Galarza Fortuna GM, Garg D, Ji R, Gebrael G, Agarwal N, Swami U, Li H. Effectiveness of ChatGPT 4.0 in Telemedicine-Based Management of Metastatic Prostate Carcinoma. Diagnostics (Basel) 2024; 14:1899. [PMID: 39272684 PMCID: PMC11394468 DOI: 10.3390/diagnostics14171899] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2024] [Revised: 07/29/2024] [Accepted: 08/22/2024] [Indexed: 09/15/2024] Open
Abstract
The recent rise in telemedicine, notably during the COVID-19 pandemic, highlights the potential of integrating artificial intelligence tools in healthcare. This study assessed the effectiveness of ChatGPT versus medical oncologists in the telemedicine-based management of metastatic prostate cancer. In this retrospective study, 102 patients who met inclusion criteria were analyzed to compare the competencies of ChatGPT and oncologists in telemedicine consultations. ChatGPT's role in pre-charting and determining the need for in-person consultations was evaluated. The primary outcome was the concordance between ChatGPT and oncologists in treatment decisions. Results showed a moderate concordance (Cohen's Kappa = 0.43, p < 0.001). The number of diagnoses made by both parties was not significantly different (median number of diagnoses: 5 vs. 5, p = 0.12). In conclusion, ChatGPT exhibited moderate agreement with oncologists in management via telemedicine, indicating the need for further research to explore its healthcare applications.
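Concordance of the kind reported above (Cohen's kappa of 0.43 between ChatGPT and oncologists) is typically derived from a cross-tabulation of the two sets of treatment decisions. A minimal sketch with invented decision labels, assuming one decision per encounter:

```python
import pandas as pd
from sklearn.metrics import cohen_kappa_score

# Hypothetical treatment decisions for the same telemedicine encounters
# (not the study's data; labels are invented for illustration).
chatgpt    = ["continue", "switch", "continue", "in-person", "switch", "continue"]
oncologist = ["continue", "switch", "switch",   "in-person", "switch", "in-person"]

table = pd.crosstab(pd.Series(chatgpt, name="ChatGPT"),
                    pd.Series(oncologist, name="Oncologist"))
print(table)

kappa = cohen_kappa_score(chatgpt, oncologist)
print(f"Cohen's kappa = {kappa:.2f}")  # values around 0.4-0.6 are usually read as moderate agreement
```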
Collapse
Affiliation(s)
- Emre Dal
- Huntsman Cancer Institute, University of Utah, Salt Lake City, UT 84112, USA
| | - Ayana Srivastava
- Huntsman Cancer Institute, University of Utah, Salt Lake City, UT 84112, USA
| | - Beverly Chigarira
- Huntsman Cancer Institute, University of Utah, Salt Lake City, UT 84112, USA
| | - Chadi Hage Chehade
- Huntsman Cancer Institute, University of Utah, Salt Lake City, UT 84112, USA
| | | | | | - Diya Garg
- Huntsman Cancer Institute, University of Utah, Salt Lake City, UT 84112, USA
| | - Richard Ji
- Huntsman Cancer Institute, University of Utah, Salt Lake City, UT 84112, USA
| | - Georges Gebrael
- Huntsman Cancer Institute, University of Utah, Salt Lake City, UT 84112, USA
| | - Neeraj Agarwal
- Huntsman Cancer Institute, University of Utah, Salt Lake City, UT 84112, USA
| | - Umang Swami
- Huntsman Cancer Institute, University of Utah, Salt Lake City, UT 84112, USA
| | - Haoran Li
- Department of Medical Oncology, University of Kansas Cancer Center, Westwood, KS 66205, USA
| |
Collapse
|
38
|
Demirel S, Kahraman-Gokalp E, Gündüz U. From Optimism to Concern: Unveiling Sentiments and Perceptions Surrounding ChatGPT on Twitter. INTERNATIONAL JOURNAL OF HUMAN–COMPUTER INTERACTION 2024:1-23. [DOI: 10.1080/10447318.2024.2392964] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/26/2024] [Revised: 07/10/2024] [Accepted: 08/12/2024] [Indexed: 10/28/2024]
Affiliation(s)
- Sadettin Demirel
- Department of New Media and Communication, Faculty of Communication, Uskudar University, Istanbul, Turkey
| | | | - Uğur Gündüz
- Department of Journalism, Faculty of Communication, Istanbul University, Istanbul, Turkey
| |
Collapse
|
39
|
Liu C, Wei M, Qin Y, Zhang M, Jiang H, Xu J, Zhang Y, Hua Q, Hou Y, Dong Y, Xia S, Li N, Zhou J. Harnessing Large Language Models for Structured Reporting in Breast Ultrasound: A Comparative Study of Open AI (GPT-4.0) and Microsoft Bing (GPT-4). ULTRASOUND IN MEDICINE & BIOLOGY 2024:S0301-5629(24)00268-0. [PMID: 39138026 DOI: 10.1016/j.ultrasmedbio.2024.07.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/04/2024] [Revised: 07/12/2024] [Accepted: 07/16/2024] [Indexed: 08/15/2024]
Abstract
OBJECTIVES To assess the capabilities of large language models (LLMs), including Open AI (GPT-4.0) and Microsoft Bing (GPT-4), in generating structured reports, Breast Imaging Reporting and Data System (BI-RADS) categories, and management recommendations from free-text breast ultrasound reports. MATERIALS AND METHODS In this retrospective study, 100 free-text breast ultrasound reports from patients who underwent surgery between January and May 2023 were gathered. The capabilities of Open AI (GPT-4.0) and Microsoft Bing (GPT-4) to convert these unstructured reports into structured ultrasound reports were studied. The quality of the structured reports, BI-RADS categories, and management recommendations generated by GPT-4.0 and Bing was evaluated by senior radiologists based on the guidelines. RESULTS Open AI (GPT-4.0) outperformed Microsoft Bing (GPT-4) in generating structured reports (88% vs. 55%; p < 0.001), assigning correct BI-RADS categories (54% vs. 47%; p = 0.013), and providing reasonable management recommendations (81% vs. 63%; p < 0.001). In predicting benign versus malignant characteristics, GPT-4.0 performed significantly better than Bing (AUC, 0.9317 vs. 0.8177; p < 0.001), while both performed significantly worse than senior radiologists (AUC, 0.9763; both p < 0.001). CONCLUSION This study highlights the potential of LLMs, specifically Open AI (GPT-4.0), in converting unstructured breast ultrasound reports into structured ones, offering accurate diagnoses and providing reasonable recommendations.
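The structuring task described above amounts to prompting an LLM to map a free-text report onto a fixed set of fields plus a BI-RADS category. The sketch below shows one plausible way to phrase such a prompt; it is not the prompt used in the study, and the field list is an assumption.

```python
# Illustrative prompt for converting a free-text breast ultrasound report into a
# structured report with a BI-RADS category (not the study's actual prompt).
FIELDS = ["lesion location", "size", "shape", "margin", "echo pattern",
          "calcifications", "vascularity", "BI-RADS category", "management recommendation"]

def structuring_prompt(free_text_report: str) -> str:
    field_list = "\n".join(f"- {f}" for f in FIELDS)
    return (
        "You are assisting a radiologist. Convert the following free-text breast "
        "ultrasound report into a structured report containing exactly these fields:\n"
        f"{field_list}\n"
        "If a field is not mentioned, write 'not reported'. "
        "Assign the BI-RADS category and recommendation strictly from the reported findings.\n\n"
        f"Report:\n{free_text_report}"
    )

print(structuring_prompt("Left breast 2 o'clock, 8 mm irregular hypoechoic mass with indistinct margins."))
```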
Collapse
Affiliation(s)
- ChaoXu Liu
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - MinYan Wei
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Yu Qin
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - MeiXiang Zhang
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Huan Jiang
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - JiaLe Xu
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - YuNing Zhang
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Qing Hua
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - YiQing Hou
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - YiJie Dong
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - ShuJun Xia
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Ning Li
- Department of Ultrasound, Yunnan Kungang Hospital, The Seventh Affiliated Hospital of Dali University, Anning, Yunnan, China
| | - JianQiao Zhou
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
| |
Collapse
|
40
|
Liu Z, Zhang W. A qualitative analysis of Chinese higher education students' intentions and influencing factors in using ChatGPT: a grounded theory approach. Sci Rep 2024; 14:18100. [PMID: 39103453 PMCID: PMC11300642 DOI: 10.1038/s41598-024-65226-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2024] [Accepted: 06/18/2024] [Indexed: 08/07/2024] Open
Abstract
The emergence of ChatGPT has significantly impacted the field of education. While much of the existing research has predominantly examined the theoretical implications of ChatGPT, there is a notable absence of empirical studies substantiating these claims. As pivotal stakeholders in education and primary users of ChatGPT, exploring the willingness and influencing factors of higher education students to use ChatGPT can offer valuable insights into the real-world needs of student users. This, in turn, can serve as a foundation for empowering education with intelligent technologies in the future. This study focuses specifically on the demographic of students in Chinese higher education who have utilized ChatGPT. Using semi-structured interviews and grounded theory methodology, we aim to comprehensively understand the extent to which students embrace new technologies. Our objective is to elucidate the behavioral inclinations and influencing factors of student users. The findings of this study will contribute practical insights for refining policy frameworks, expanding the dissemination of quality resources, optimizing and upgrading products for an enhanced user experience, and fostering higher-order thinking skills to adeptly navigate evolving technological landscapes. In conclusion, this research endeavors to bridge the gap between theoretical discussions and practical applications.
Collapse
Affiliation(s)
- Zhaoyang Liu
- Faculty of Education, Shaanxi Normal University, Xi'an, China.
| | - Wenlan Zhang
- Faculty of Education, Shaanxi Normal University, Xi'an, China.
| |
Collapse
|
41
|
Mese I. Tracing the Footprints of AI in Radiology Literature: A Detailed Analysis of Journal Abstracts. ROFO-FORTSCHR RONTG 2024; 196:843-849. [PMID: 38228155 DOI: 10.1055/a-2224-9230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/18/2024]
Affiliation(s)
- Ismail Mese
- Department of Radiology, Istanbul Erenkoy Mental and Nervous Diseases Training and Research Hospital, Istanbul, Turkey
| |
Collapse
|
42
|
Zhui L, Fenghe L, Xuehu W, Qining F, Wei R. Ethical Considerations and Fundamental Principles of Large Language Models in Medical Education: Viewpoint. J Med Internet Res 2024; 26:e60083. [PMID: 38971715 PMCID: PMC11327620 DOI: 10.2196/60083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2024] [Accepted: 07/06/2024] [Indexed: 07/08/2024] Open
Abstract
This viewpoint article first explores the ethical challenges associated with the future application of large language models (LLMs) in the context of medical education. These challenges include not only ethical concerns related to the development of LLMs, such as artificial intelligence (AI) hallucinations, information bias, privacy and data risks, and deficiencies in terms of transparency and interpretability but also issues concerning the application of LLMs, including deficiencies in emotional intelligence, educational inequities, problems with academic integrity, and questions of responsibility and copyright ownership. This paper then analyzes existing AI-related legal and ethical frameworks and highlights their limitations with regard to the application of LLMs in the context of medical education. To ensure that LLMs are integrated in a responsible and safe manner, the authors recommend the development of a unified ethical framework that is specifically tailored for LLMs in this field. This framework should be based on 8 fundamental principles: quality control and supervision mechanisms; privacy and data protection; transparency and interpretability; fairness and equal treatment; academic integrity and moral norms; accountability and traceability; protection and respect for intellectual property; and the promotion of educational research and innovation. The authors further discuss specific measures that can be taken to implement these principles, thereby laying a solid foundation for the development of a comprehensive and actionable ethical framework. Such a unified ethical framework based on these 8 fundamental principles can provide clear guidance and support for the application of LLMs in the context of medical education. This approach can help establish a balance between technological advancement and ethical safeguards, thereby ensuring that medical education can progress without compromising the principles of fairness, justice, or patient safety and establishing a more equitable, safer, and more efficient environment for medical education.
Collapse
Affiliation(s)
- Li Zhui
- Department of Vascular Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
| | - Li Fenghe
- Department of Vascular Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
| | - Wang Xuehu
- Department of Vascular Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
| | - Fu Qining
- Department of Vascular Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
| | - Ren Wei
- Department of Vascular Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
| |
Collapse
|
43
|
Ren D, Roland D. Arise robot overlords! A synergy of artificial intelligence in the evolution of scientific writing and publishing. Pediatr Res 2024; 96:576-578. [PMID: 38627589 DOI: 10.1038/s41390-024-03217-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/27/2024] [Accepted: 03/29/2024] [Indexed: 06/09/2024]
Affiliation(s)
- Dennis Ren
- Division of Emergency Medicine, Children's National Hospital, Washington, DC, USA.
| | - Damian Roland
- SAPPHIRE Group, Population Health Sciences, Leicester University, Leicester, UK
- Paediatric Emergency Medicine Leicester Academic (PEMLA) Group, Children's Emergency Department, Leicester Royal Infirmary, Leicester, UK
| |
Collapse
|
44
|
Taylor WL, Cheng R, Weinblatt AI, Bergstein V, Long WJ. An Artificial Intelligence Chatbot is an Accurate and Useful Online Patient Resource Prior to Total Knee Arthroplasty. J Arthroplasty 2024; 39:S358-S362. [PMID: 38350517 DOI: 10.1016/j.arth.2024.02.005] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/08/2023] [Accepted: 02/05/2024] [Indexed: 02/15/2024] Open
Abstract
BACKGROUND Online information is a useful resource for patients seeking advice on their orthopaedic care. While traditional websites provide responses to specific frequently asked questions (FAQs), sophisticated artificial intelligence tools may be able to provide the same information to patients in a more accessible manner. Chat Generative Pretrained Transformer (ChatGPT) is a powerful artificial intelligence chatbot that has been shown to effectively draw on its large reserves of information in a conversational context with a user. The purpose of this study was to assess the accuracy and reliability of ChatGPT-generated responses to FAQs regarding total knee arthroplasty. METHODS We distributed a survey that challenged arthroplasty surgeons to identify which of the 2 responses to FAQs on our institution's website was human-written and which was generated by ChatGPT. All questions were total knee arthroplasty-related. The second portion of the survey investigated the potential to further leverage ChatGPT to assist with translation and accessibility as a means to better meet the needs of our diverse patient population. RESULTS Surgeons correctly identified the ChatGPT-generated responses 4 out of 10 times on average (range: 0 to 7). No consensus was reached on any of the responses to the FAQs. Additionally, over 90% of our surgeons strongly encouraged the use of ChatGPT to more effectively accommodate the diverse patient populations that seek information from our hospital's online resources. CONCLUSIONS ChatGPT provided accurate, reliable answers to our website's FAQs. Surgeons also agreed that ChatGPT's ability to provide targeted, language-specific responses to FAQs would be of benefit to our diverse patient population.
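The detection task described above (surgeons judging which of two responses was ChatGPT-generated, succeeding about 4 times in 10 on average) can be checked against chance with a simple binomial test. The totals below are hypothetical, since the abstract does not report the number of respondents; 50% is the accuracy expected from guessing in a two-option forced choice.

```python
from scipy.stats import binomtest

# Hypothetical totals: e.g. 15 surgeons x 10 question pairs, ~40% correct overall.
n_judgements = 150
n_correct = 60

# Two-sided test of the observed accuracy against the 50% chance level.
result = binomtest(n_correct, n_judgements, p=0.5)
print(f"observed accuracy = {n_correct / n_judgements:.0%}")
print(f"p-value vs. chance = {result.pvalue:.3f}")
```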
Collapse
Affiliation(s)
- Walter L Taylor
- Department of Adult Reconstruction and Joint Replacement, Hospital for Special Surgery, New York, New York
| | - Ryan Cheng
- Department of Adult Reconstruction and Joint Replacement, Hospital for Special Surgery, New York, New York
| | - Aaron I Weinblatt
- Department of Adult Reconstruction and Joint Replacement, Hospital for Special Surgery, New York, New York
| | - Victoria Bergstein
- Department of Adult Reconstruction and Joint Replacement, Hospital for Special Surgery, New York, New York
| | - William J Long
- Department of Adult Reconstruction and Joint Replacement, Hospital for Special Surgery, New York, New York
| |
Collapse
|
45
|
Su Z, Tang G, Huang R, Qiao Y, Zhang Z, Dai X. Based on Medicine, The Now and Future of Large Language Models. Cell Mol Bioeng 2024; 17:263-277. [PMID: 39372551 PMCID: PMC11450117 DOI: 10.1007/s12195-024-00820-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2024] [Accepted: 09/08/2024] [Indexed: 10/08/2024] Open
Abstract
OBJECTIVES This review explores the potential applications of large language models (LLMs) such as ChatGPT, GPT-3.5, and GPT-4 in the medical field, aiming to encourage their prudent use, provide professional support, and develop accessible medical AI tools that adhere to healthcare standards. METHODS This paper examines the impact of technologies such as OpenAI's Generative Pre-trained Transformers (GPT) series, including GPT-3.5 and GPT-4, and other large language models (LLMs) in medical education, scientific research, clinical practice, and nursing. Specifically, it includes supporting curriculum design, acting as personalized learning assistants, creating standardized simulated patient scenarios in education; assisting with writing papers, data analysis, and optimizing experimental designs in scientific research; aiding in medical imaging analysis, decision-making, patient education, and communication in clinical practice; and reducing repetitive tasks, promoting personalized care and self-care, providing psychological support, and enhancing management efficiency in nursing. RESULTS LLMs, including ChatGPT, have demonstrated significant potential and effectiveness in the aforementioned areas, yet their deployment in healthcare settings is fraught with ethical complexities, potential lack of empathy, and risks of biased responses. CONCLUSION Despite these challenges, significant medical advancements can be expected through the proper use of LLMs and appropriate policy guidance. Future research should focus on overcoming these barriers to ensure the effective and ethical application of LLMs in the medical field.
Collapse
Affiliation(s)
- Ziqing Su
- Department of Neurosurgery, The First Affiliated Hospital of Anhui Medical University, 218 Jixi Road, Hefei, 230022 P.R. China
- Department of Clinical Medicine, The First Clinical College of Anhui Medical University, Hefei, 230022 P.R. China
| | - Guozhang Tang
- Department of Neurosurgery, The First Affiliated Hospital of Anhui Medical University, 218 Jixi Road, Hefei, 230022 P.R. China
- Department of Clinical Medicine, The Second Clinical College of Anhui Medical University, Hefei, 230032 Anhui P.R. China
| | - Rui Huang
- Department of Neurosurgery, The First Affiliated Hospital of Anhui Medical University, 218 Jixi Road, Hefei, 230022 P.R. China
- Department of Clinical Medicine, The First Clinical College of Anhui Medical University, Hefei, 230022 P.R. China
| | - Yang Qiao
- Department of Neurosurgery, The First Affiliated Hospital of Anhui Medical University, 218 Jixi Road, Hefei, 230022 P.R. China
| | - Zheng Zhang
- Department of Neurosurgery, The First Affiliated Hospital of Anhui Medical University, 218 Jixi Road, Hefei, 230022 P.R. China
- Department of Clinical Medicine, The First Clinical College of Anhui Medical University, Hefei, 230022 P.R. China
| | - Xingliang Dai
- Department of Neurosurgery, The First Affiliated Hospital of Anhui Medical University, 218 Jixi Road, Hefei, 230022 P.R. China
- Department of Research & Development, East China Institute of Digital Medical Engineering, Shangrao, 334000 P.R. China
| |
Collapse
|
46
|
Yokokawa D, Yanagita Y, Li Y, Yamashita S, Shikino K, Noda K, Tsukamoto T, Uehara T, Ikusaka M. For any disease a human can imagine, ChatGPT can generate a fake report. Diagnosis (Berl) 2024; 11:329-332. [PMID: 38386808 DOI: 10.1515/dx-2024-0007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2024] [Accepted: 02/06/2024] [Indexed: 02/24/2024]
Affiliation(s)
- Daiki Yokokawa
- Department of General Medicine, Chiba University Hospital, Chiba, Japan
| | - Yasutaka Yanagita
- Department of General Medicine, Chiba University Hospital, Chiba, Japan
| | - Yu Li
- Department of General Medicine, Chiba University Hospital, Chiba, Japan
| | - Shiho Yamashita
- Department of General Medicine, Chiba University Hospital, Chiba, Japan
| | - Kiyoshi Shikino
- Department of General Medicine, Chiba University Hospital, Chiba, Japan
- Department of Community-oriented Medical Education, Chiba University Graduate School of Medicine, Chiba, Japan
| | - Kazutaka Noda
- Department of General Medicine, Chiba University Hospital, Chiba, Japan
| | - Tomoko Tsukamoto
- Department of General Medicine, Chiba University Hospital, Chiba, Japan
| | - Takanori Uehara
- Department of General Medicine, Chiba University Hospital, Chiba, Japan
| | - Masatomi Ikusaka
- Department of General Medicine, Chiba University Hospital, Chiba, Japan
| |
Collapse
|
47
|
Zhui L, Yhap N, Liping L, Zhengjie W, Zhonghao X, Xiaoshu Y, Hong C, Xuexiu L, Wei R. Impact of Large Language Models on Medical Education and Teaching Adaptations. JMIR Med Inform 2024; 12:e55933. [PMID: 39087590 PMCID: PMC11294775 DOI: 10.2196/55933] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/29/2023] [Revised: 04/25/2024] [Accepted: 06/08/2024] [Indexed: 08/02/2024] Open
Abstract
This viewpoint article explores the transformative role of large language models (LLMs) in the field of medical education, highlighting their potential to enhance teaching quality, promote personalized learning paths, strengthen clinical skills training, optimize teaching assessment processes, boost the efficiency of medical research, and support continuing medical education. However, the use of LLMs entails certain challenges, such as questions regarding the accuracy of information, the risk of overreliance on technology, a lack of emotional recognition capabilities, and concerns related to ethics, privacy, and data security. This article emphasizes that to maximize the potential of LLMs and overcome these challenges, educators must exhibit leadership in medical education, adjust their teaching strategies flexibly, cultivate students' critical thinking, and emphasize the importance of practical experience, thus ensuring that students can use LLMs correctly and effectively. By adopting such a comprehensive and balanced approach, educators can train health care professionals who are proficient in the use of advanced technologies and who exhibit solid professional ethics and practical skills, thus laying a strong foundation for these professionals to overcome future challenges in the health care sector.
Collapse
Affiliation(s)
- Li Zhui
- Department of Vascular Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
| | - Nina Yhap
- Department of General Surgery, Queen Elizabeth Hospital, St Michael, Barbados
| | - Liu Liping
- Department of Ultrasound, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
| | - Wang Zhengjie
- Department of Nuclear Medicine, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
| | - Xiong Zhonghao
- Department of Acupuncture and Moxibustion, Chongqing Traditional Chinese Medicine Hospital, Chongqing, China
| | - Yuan Xiaoshu
- Department of Anesthesia, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
| | - Cui Hong
- Department of Anesthesia, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
| | - Liu Xuexiu
- Department of Neonatology, Children’s Hospital of Chongqing Medical University, Chongqing, China
| | - Ren Wei
- Department of Vascular Surgery, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
| |
Collapse
|
48
|
Chen J, Tao BK, Park S, Bovill E. Can ChatGPT Fool the Match? Artificial Intelligence Personal Statements for Plastic Surgery Residency Applications: A Comparative Study. Plast Surg (Oakv) 2024:22925503241264832. [PMID: 39553535 PMCID: PMC11561920 DOI: 10.1177/22925503241264832] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2024] [Revised: 04/30/2024] [Accepted: 05/21/2024] [Indexed: 11/19/2024] Open
Abstract
Introduction: Personal statements can be decisive in Canadian residency applications. With the rise of AI technology, ethical concerns regarding authenticity and originality become more pressing. This study explores the capability of ChatGPT to produce personal statements for plastic surgery residency that match the quality of statements written by successful applicants. Methods: ChatGPT was used to generate a cohort of personal statements for CaRMS (Canadian Residency Matching Service) to compare with previously successful Plastic Surgery applications. Each AI-generated and human-written statement was randomized and anonymized prior to assessment. Two retired members of the plastic surgery residency selection committee from the University of British Columbia evaluated these on a 0 to 10 scale and provided a binary response judging whether each statement was AI- or human-written. Statistical analysis included Welch 2-sample t tests and Cohen's Kappa for agreement. Results: Twenty-two personal statements (11 AI-generated by ChatGPT and 11 human-written) were evaluated. The overall mean scores were 7.48 (SD 0.932) and 7.68 (SD 0.716), respectively, with no significant difference between the AI and human groups (P = .4129). The average accuracy in distinguishing between human and AI letters was 65.9%. The Cohen's Kappa value was 0.374. Conclusions: ChatGPT can generate personal statements for plastic surgery residency applications with quality indistinguishable from human-written counterparts, as evidenced by the lack of a significant scoring difference and the moderate accuracy of discrimination by experienced surgeons. These findings highlight the evolving role of AI and the need for updated evaluative criteria or guidelines in the residency application process.
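The score comparison above (means of 7.48 for AI-generated and 7.68 for human-written statements, with no significant difference) rests on a Welch two-sample t-test, which does not assume equal variances between groups. A minimal sketch with made-up scores on the same 0 to 10 scale:

```python
from scipy.stats import ttest_ind

# Hypothetical reviewer scores (not the study's raw data).
ai_scores    = [7.0, 8.0, 7.5, 6.5, 8.5, 7.0, 8.0, 7.5, 6.5, 8.0, 7.5]
human_scores = [8.0, 7.5, 8.5, 7.0, 8.0, 7.5, 7.0, 8.5, 7.0, 8.0, 7.5]

# equal_var=False requests Welch's t-test rather than Student's t-test.
t_stat, p_value = ttest_ind(ai_scores, human_scores, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```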
Collapse
Affiliation(s)
- Jeffrey Chen
- Michael G. DeGroote School of Medicine, Faculty of Health Sciences, McMaster University, Hamilton, Ontario, Canada
| | - Brendan K. Tao
- Faculty of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
| | - Shihyun Park
- School of Pharmacy, University of Waterloo, Kitchener, Ontario, Canada
| | - Esta Bovill
- Division of Plastic Surgery, University of British Columbia, Vancouver, British Columbia, Canada
| |
Collapse
|
49
|
Hassanipour S, Nayak S, Bozorgi A, Keivanlou MH, Dave T, Alotaibi A, Joukar F, Mellatdoust P, Bakhshi A, Kuriyakose D, Polisetty LD, Chimpiri M, Amini-Salehi E. The Ability of ChatGPT in Paraphrasing Texts and Reducing Plagiarism: A Descriptive Analysis. JMIR MEDICAL EDUCATION 2024; 10:e53308. [PMID: 38989841 PMCID: PMC11250043 DOI: 10.2196/53308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/03/2023] [Revised: 01/03/2024] [Accepted: 05/01/2024] [Indexed: 07/12/2024]
Abstract
Background The introduction of ChatGPT by OpenAI has garnered significant attention. Among its capabilities, paraphrasing stands out. Objective This study aims to investigate the level of plagiarism remaining in paraphrased text produced by this chatbot. Methods Three texts of varying lengths were presented to ChatGPT. ChatGPT was then instructed to paraphrase the provided texts using five different prompts. In the subsequent stage of the study, the texts were divided into separate paragraphs, and ChatGPT was requested to paraphrase each paragraph individually. Lastly, in the third stage, ChatGPT was asked to paraphrase the texts it had previously generated. Results The average plagiarism rate in the texts generated by ChatGPT was 45% (SD 10%). ChatGPT exhibited a substantial reduction in plagiarism for the provided texts (mean difference -0.51, 95% CI -0.54 to -0.48; P<.001). Furthermore, when comparing the second attempt with the initial attempt, a significant decrease in the plagiarism rate was observed (mean difference -0.06, 95% CI -0.08 to -0.03; P<.001). The number of paragraphs in the texts demonstrated a noteworthy association with the percentage of plagiarism, with texts consisting of a single paragraph exhibiting the lowest plagiarism rate (P<.001). Conclusions Although ChatGPT substantially reduces plagiarism within texts, the remaining levels of plagiarism are still relatively high. This underscores the need for caution among researchers who incorporate this chatbot into their work.
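The mean differences with 95% confidence intervals reported above come from comparing plagiarism rates for the same texts before and after paraphrasing, which is a paired design. The sketch below illustrates that calculation with invented proportions; the data and group sizes are assumptions, not the study's results.

```python
import numpy as np
from scipy import stats

# Hypothetical plagiarism proportions for the same texts before and after
# ChatGPT paraphrasing (invented values for illustration only).
before = np.array([0.95, 0.90, 1.00, 0.85, 0.92, 0.88])
after  = np.array([0.45, 0.40, 0.55, 0.35, 0.50, 0.42])

diff = after - before
t_res = stats.ttest_rel(after, before)

# 95% confidence interval for the mean paired difference.
se = stats.sem(diff)
ci = stats.t.interval(0.95, df=len(diff) - 1, loc=diff.mean(), scale=se)
print(f"mean difference = {diff.mean():.2f}, "
      f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), p = {t_res.pvalue:.4f}")
```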
Collapse
Affiliation(s)
- Soheil Hassanipour
- Gastrointestinal and Liver Diseases Research Center, Guilan University of Medical Sciences, Rasht, Iran
| | - Sandeep Nayak
- Department of Internal Medicine, Yale New Haven Health Bridgeport Hospital, Bridgeport, CT, United States
| | - Ali Bozorgi
- Tehran Heart Center, Tehran University of Medical Sciences, Tehran, Iran
| | | | - Tirth Dave
- Department of Internal Medicine, Bukovinian State Medical University, Chernivtsi, Ukraine
| | | | - Farahnaz Joukar
- Gastrointestinal and Liver Diseases Research Center, Guilan University of Medical Sciences, Rasht, Iran
| | - Parinaz Mellatdoust
- Dipartimento di Elettronica Informazione Bioingegneria, Politecnico di Milano, Milan, Italy
| | - Arash Bakhshi
- Gastrointestinal and Liver Diseases Research Center, Guilan University of Medical Sciences, Rasht, Iran
| | - Dona Kuriyakose
- Department of Internal Medicine, St. Joseph's Mission Hospital, Anchal, Kollam District, Kerala, India
| | - Lakshmi D Polisetty
- Department of Internal Medicine, Yale New Haven Health Bridgeport Hospital, Bridgeport, CT, United States
| | | | - Ehsan Amini-Salehi
- Gastrointestinal and Liver Diseases Research Center, Guilan University of Medical Sciences, Rasht, Iran
| |
Collapse
|
50
|
Nakaura T, Ito R, Ueda D, Nozaki T, Fushimi Y, Matsui Y, Yanagawa M, Yamada A, Tsuboyama T, Fujima N, Tatsugami F, Hirata K, Fujita S, Kamagata K, Fujioka T, Kawamura M, Naganawa S. The impact of large language models on radiology: a guide for radiologists on the latest innovations in AI. Jpn J Radiol 2024; 42:685-696. [PMID: 38551772 PMCID: PMC11217134 DOI: 10.1007/s11604-024-01552-0] [Citation(s) in RCA: 22] [Impact Index Per Article: 22.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2023] [Accepted: 02/21/2024] [Indexed: 07/03/2024]
Abstract
The advent of Deep Learning (DL) has significantly propelled the field of diagnostic radiology forward by enhancing image analysis and interpretation. The introduction of the Transformer architecture, followed by the development of Large Language Models (LLMs), has further revolutionized this domain. LLMs now possess the potential to automate and refine the radiology workflow, extending from report generation to assistance in diagnostics and patient care. The integration of multimodal technology with LLMs could potentially leapfrog these applications to unprecedented levels. However, LLMs come with unresolved challenges such as information hallucinations and biases, which can affect clinical reliability. Despite these issues, the legislative and guideline frameworks have yet to catch up with technological advancements. Radiologists must acquire a thorough understanding of these technologies to leverage LLMs' potential to the fullest while maintaining medical safety and ethics. This review aims to aid in that endeavor.
Collapse
Affiliation(s)
- Takeshi Nakaura
- Department of Central Radiology, Kumamoto University Hospital, Honjo 1-1-1, Kumamoto, 860-8556, Japan.
| | - Rintaro Ito
- Department of Radiology, Nagoya University Graduate School of Medicine, Nagoya, Aichi, Japan
| | - Daiju Ueda
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, 1‑4‑3 Asahi‑Machi, Abeno‑ku, Osaka, 545‑8585, Japan
| | - Taiki Nozaki
- Department of Radiology, Keio University School of Medicine, Shinjuku‑ku, Tokyo, Japan
| | - Yasutaka Fushimi
- Department of Diagnostic Imaging and Nuclear Medicine, Kyoto University Graduate School of Medicine, Sakyo-ku, Kyoto, Japan
| | - Yusuke Matsui
- Department of Radiology, Faculty of Medicine, Dentistry and Pharmaceutical Sciences, Okayama University, Kita‑ku, Okayama, Japan
| | - Masahiro Yanagawa
- Department of Radiology, Osaka University Graduate School of Medicine, Suita City, Osaka, Japan
| | - Akira Yamada
- Department of Radiology, Shinshu University School of Medicine, Matsumoto, Nagano, Japan
| | - Takahiro Tsuboyama
- Department of Radiology, Osaka University Graduate School of Medicine, Suita City, Osaka, Japan
| | - Noriyuki Fujima
- Department of Diagnostic and Interventional Radiology, Hokkaido University Hospital, Sapporo, Japan
| | - Fuminari Tatsugami
- Department of Diagnostic Radiology, Hiroshima University, Minami‑ku, Hiroshima, Japan
| | - Kenji Hirata
- Department of Diagnostic Imaging, Graduate School of Medicine, Hokkaido University, Kita‑ku, Sapporo, Hokkaido, Japan
| | - Shohei Fujita
- Department of Radiology, University of Tokyo, Bunkyo‑ku, Tokyo, Japan
| | - Koji Kamagata
- Department of Radiology, Juntendo University Graduate School of Medicine, Bunkyo‑ku, Tokyo, Japan
| | - Tomoyuki Fujioka
- Department of Diagnostic Radiology, Tokyo Medical and Dental University, Bunkyo‑ku, Tokyo, Japan
| | - Mariko Kawamura
- Department of Radiology, Nagoya University Graduate School of Medicine, Nagoya, Aichi, Japan
| | - Shinji Naganawa
- Department of Radiology, Nagoya University Graduate School of Medicine, Nagoya, Aichi, Japan
| |
Collapse
|