1
Ozmen BB, Mathur P. Evidence-based artificial intelligence: Implementing retrieval-augmented generation models to enhance clinical decision support in plastic surgery. J Plast Reconstr Aesthet Surg 2025;104:414-416. [PMID: 40174259] [DOI: 10.1016/j.bjps.2025.03.053]
Abstract
The rapid advancement of large language models (LLMs) has generated significant enthusiasm within healthcare, especially in supporting clinical decision-making and patient management. However, inherent limitations including hallucinations, outdated clinical context, and unreliable references pose serious concerns for their clinical utility. Retrieval-Augmented Generation (RAG) models address these limitations by integrating validated, curated medical literature directly into AI workflows, significantly enhancing the accuracy, relevance, and transparency of generated outputs. This viewpoint discusses how RAG frameworks can specifically benefit plastic and reconstructive surgery by providing contextually accurate, evidence-based, and clinically grounded support for decision-making. Potential clinical applications include clinical decision support, efficient evidence synthesis, customizable patient education, informed consent materials, multilingual capabilities, and structured surgical documentation. By querying specialized databases that incorporate contemporary guidelines and literature, RAG models can markedly reduce inaccuracies and increase the reliability of AI-generated responses. However, the implementation of RAG technology demands rigorous database curation, regular updating with guidelines from surgical societies, and ongoing validation to maintain clinical relevance. Addressing challenges related to data privacy, governance, ethical considerations, and user training remains critical for successful clinical adoption. In conclusion, RAG models represent a significant advancement in overcoming traditional LLM limitations, promoting transparency and clinical accuracy with great potential for plastic surgery. Plastic surgeons and researchers are encouraged to explore and integrate these innovative generative AI frameworks to enhance patient care, surgical outcomes, communication, documentation quality, and education.
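As a minimal, hedged illustration of the RAG pattern described above (retrieve curated evidence, then ground the model's prompt in it), consider the following Python sketch. The toy corpus, bag-of-words similarity, and llm_generate placeholder are assumptions for illustration, not the authors' implementation; a clinical deployment would use dense embeddings, a vector store, and a validated guideline database.

# Minimal RAG sketch: retrieval over a curated corpus, then a grounded prompt.
# CORPUS and llm_generate are hypothetical placeholders.
import math
from collections import Counter

CORPUS = [
    "DIEP flap breast reconstruction: perforator selection criteria ...",
    "Postoperative monitoring of free flaps: clinical signs of compromise ...",
    "Informed consent for rhinoplasty: risks, alternatives, recovery ...",
]

def bow(text: str) -> Counter:
    """Bag-of-words vector; a real system would use dense embeddings."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k corpus snippets most similar to the query."""
    q = bow(query)
    return sorted(CORPUS, key=lambda d: cosine(q, bow(d)), reverse=True)[:k]

def build_prompt(query: str) -> str:
    """Compose a prompt that forces the model to answer from cited sources."""
    context = "\n".join(f"[{i + 1}] {d}" for i, d in enumerate(retrieve(query)))
    return (f"Answer using ONLY the sources below and cite them.\n"
            f"Sources:\n{context}\n\nQuestion: {query}")

# In a real pipeline: response = llm_generate(build_prompt(question))
print(build_prompt("How should free flaps be monitored after surgery?"))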
Affiliation(s)
- Berk B Ozmen
- Department of Plastic Surgery, Cleveland Clinic, Cleveland, OH, USA.
- Piyush Mathur
- Department of General Anesthesiology, Cleveland Clinic, Cleveland, OH, USA; BrainXAI ReSearch, BrainX LLC, Cleveland, OH, USA
2
Jaber MJ, Al-Bashaireh AM, Kouri O, Aldiqs MA, Alqudah OM, Khraisat OM, Bindahmsh AA, Alshodukhi AM, Almutairi AO, Hakeem NA. Development and Validation of a Workflow Instrument to Evaluate the Success of Electronic Health Records Implementation from a Nursing Perspective: An Exploratory and Descriptive Study. Global Journal on Quality and Safety in Healthcare 2025;8:15-22. [PMID: 39935718] [PMCID: PMC11808855] [DOI: 10.36401/jqsh-24-16]
Abstract
Introduction Electronic medical records (EMR) have been recognized as practical tools for improving the quality and safety of healthcare, despite only occasional use in hospitals worldwide. Epic is an integrated software suite whose functionality ranges from patient administration through systems for healthcare providers to billing, integration with the primary health sector, and a facility granting patients access to their own data. The implementation process is complicated, and designing effective methods requires understanding users' attitudes toward these information technologies. This study aimed to develop and validate a questionnaire that measures the efficacy of workflow use during EMR (Epic) implementation, and to describe nurses' views on workflow use, quality, and satisfaction. Methods Following a literature review, an initial pool of 57 items was generated around three primary constructs: use of, quality of, and user satisfaction with the tool's workflow. Internal consistency reliability was assessed with Cronbach's alpha, and correlation coefficients were calculated for construct validity. Results The final scale comprised 53 items corresponding to five distinct factors: use of workflow, information quality, service quality, use of EMR, and user satisfaction and the influence of workflow on clinical care. Cronbach's alpha for the full scale was 0.95. Construct validity was assessed using the Kaiser-Meyer-Olkin measure of sampling adequacy and Bartlett's test of sphericity (0.976), and was tested twice using exploratory factor analysis with principal component analysis. Conclusion The resulting scale, covering use of workflow, information quality, service quality, use of EMR, and user satisfaction, has good reliability and validity and can be used when implementing technology in healthcare.
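For readers unfamiliar with the reliability statistic reported here, the following Python sketch computes Cronbach's alpha from a toy response matrix using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the responses are invented, not the study's data.

# Cronbach's alpha for a Likert-style instrument (hypothetical data).
from statistics import pvariance

responses = [  # rows = respondents, columns = items (1-5 Likert scores)
    [4, 5, 4, 4],
    [3, 4, 3, 4],
    [5, 5, 4, 5],
    [2, 3, 2, 3],
]

def cronbach_alpha(rows: list[list[int]]) -> float:
    k = len(rows[0])                       # number of items
    items = list(zip(*rows))               # transpose: per-item score lists
    item_var = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(r) for r in rows])  # variance of scale totals
    return k / (k - 1) * (1 - item_var / total_var)

print(f"alpha = {cronbach_alpha(responses):.2f}")  # 0.93 here; the study reported 0.95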
Affiliation(s)
- Mohammad J. Jaber
- Department of Nursing, Emergency Administration, King Fahad Medical City, Riyadh, KSA
- Osama Kouri
- Faculty of Nursing, Yarmouk University, Amman, Jordan
- Ola M. Alqudah
- Faculty of Nursing, Al-Ahliyya Amman University, Amman, Jordan
- Alanoud A. Bindahmsh
- Department of Nursing, Emergency Administration, King Fahad Medical City, Riyadh, KSA
- Abeer M. Alshodukhi
- Department of Nursing, Emergency Administration, King Fahad Medical City, Riyadh, KSA
- Amer O. Almutairi
- Department of Nursing, Emergency Administration, King Fahad Medical City, Riyadh, KSA
- Nevin A. Hakeem
- Quality & Accreditation Administration, King Fahad Medical City, Riyadh, KSA
3
Murad MH, Vaa Stelling BE, West CP, Hasan B, Simha S, Saadi S, Firwana M, Viola KE, Prokop LJ, Nayfeh T, Wang Z. Measuring Documentation Burden in Healthcare. J Gen Intern Med 2024;39:2837-2848. [PMID: 39073484] [PMCID: PMC11534919] [DOI: 10.1007/s11606-024-08956-8]
Abstract
BACKGROUND The enactment of the Health Information Technology for Economic and Clinical Health Act and the wide adoption of electronic health record (EHR) systems have ushered in increasing documentation burden, frequently cited as a key factor affecting the work experience of healthcare professionals and a contributor to burnout. This systematic review aims to identify and characterize measures of documentation burden. METHODS We integrated discussions with Key Informants and a comprehensive search of the literature, including MEDLINE, Embase, Scopus, and gray literature published between 2010 and 2023. Data were narratively and thematically synthesized. RESULTS We identified 135 articles about measuring documentation burden. We classified measures into 11 categories: overall time spent in EHR, activities related to clinical documentation, inbox management, time spent in clinical review, time spent in orders, work outside work/after hours, administrative tasks (billing and insurance related), fragmentation of workflow, measures of efficiency, EHR activity rate, and usability. The most common source of data for most measures was EHR usage logs. Direct tracking, such as time-motion analysis, was fairly uncommon. Measures were developed and applied across various settings and populations, with physicians and nurses in the USA being the most frequently represented healthcare professionals. Evidence of validity of these measures was limited and incomplete. Data on the appropriateness of measures in terms of scalability, feasibility, or equity across various contexts were limited. The physician perspective was the most robustly captured and focused prominently on increased stress and burnout. DISCUSSION Numerous measures for documentation burden are available and have been tested in a variety of settings and contexts. However, most are one-dimensional, do not capture various domains of this construct, and lack robust validity evidence. This report serves as a call to action, highlighting an urgent need for measure development that represents diverse clinical contexts and supports future interventions.
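As one concrete illustration of a log-derived measure from the review's taxonomy ("work outside work/after hours"), here is a hedged Python sketch; the (clinician, timestamp) log schema and the scheduled-hours window are assumptions, since EHR vendor log formats differ.

# Estimate after-hours EHR activity from usage-log timestamps (hypothetical schema).
from datetime import datetime

WORKDAY = (7, 18)  # assumed scheduled hours: 07:00-18:00

log = [  # (clinician_id, timestamp of an EHR action)
    ("dr_a", datetime(2024, 5, 6, 9, 15)),
    ("dr_a", datetime(2024, 5, 6, 21, 40)),  # evening documentation
    ("dr_a", datetime(2024, 5, 7, 6, 5)),    # early-morning documentation
]

def after_hours_fraction(events, start=WORKDAY[0], end=WORKDAY[1]) -> float:
    """Fraction of logged EHR actions occurring outside scheduled hours."""
    outside = sum(1 for _, ts in events if not start <= ts.hour < end)
    return outside / len(events)

print(f"after-hours EHR activity: {after_hours_fraction(log):.0%}")  # 67%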
Affiliation(s)
- M Hassan Murad
- Mayo Clinic Evidence-Based Practice Center, Mayo Clinic Robert D. and Patricia E. Kern Center for the Science of Healthcare Delivery, Mayo Clinic, Rochester, MN, USA.
- Brianna E Vaa Stelling
- Division of Community Internal Medicine, Department of Medicine, Mayo Clinic, Rochester, MN, USA
- Colin P West
- Division of General Internal Medicine, Department of Medicine, Mayo Clinic, Rochester, MN, USA
- Bashar Hasan
- Mayo Clinic Evidence-Based Practice Center, Mayo Clinic Robert D. and Patricia E. Kern Center for the Science of Healthcare Delivery, Mayo Clinic, Rochester, MN, USA
- Suvyaktha Simha
- Mayo Clinic Evidence-Based Practice Center, Mayo Clinic Robert D. and Patricia E. Kern Center for the Science of Healthcare Delivery, Mayo Clinic, Rochester, MN, USA
- Samer Saadi
- Mayo Clinic Evidence-Based Practice Center, Mayo Clinic Robert D. and Patricia E. Kern Center for the Science of Healthcare Delivery, Mayo Clinic, Rochester, MN, USA
- Mohammed Firwana
- Mayo Clinic Evidence-Based Practice Center, Mayo Clinic Robert D. and Patricia E. Kern Center for the Science of Healthcare Delivery, Mayo Clinic, Rochester, MN, USA
- Kelly E Viola
- Mayo Clinic Evidence-Based Practice Center, Mayo Clinic Robert D. and Patricia E. Kern Center for the Science of Healthcare Delivery, Mayo Clinic, Rochester, MN, USA
- Tarek Nayfeh
- Mayo Clinic Evidence-Based Practice Center, Mayo Clinic Robert D. and Patricia E. Kern Center for the Science of Healthcare Delivery, Mayo Clinic, Rochester, MN, USA
- Zhen Wang
- Mayo Clinic Evidence-Based Practice Center, Mayo Clinic Robert D. and Patricia E. Kern Center for the Science of Healthcare Delivery, Mayo Clinic, Rochester, MN, USA
4
Cohen SA, Brant A, Fisher AC, Pershing S, Do D, Pan C. Dr. Google vs. Dr. ChatGPT: Exploring the Use of Artificial Intelligence in Ophthalmology by Comparing the Accuracy, Safety, and Readability of Responses to Frequently Asked Patient Questions Regarding Cataracts and Cataract Surgery. Semin Ophthalmol 2024;39:472-479. [PMID: 38516983] [DOI: 10.1080/08820538.2024.2326058]
Abstract
PURPOSE Patients are using online search modalities to learn about their eye health. While Google remains the most popular search engine, the use of large language models (LLMs) like ChatGPT has increased. Cataract surgery is the most common surgical procedure in the US, and there are limited data on the quality of the online information returned by searches related to cataract surgery on search engines such as Google and on LLM platforms such as ChatGPT. We identified the most common patient frequently asked questions (FAQs) about cataracts and cataract surgery and evaluated the accuracy, safety, and readability of the answers to these questions provided by both Google and ChatGPT. We also demonstrated the utility of ChatGPT in writing notes and creating patient education materials. METHODS The top 20 FAQs related to cataracts and cataract surgery were recorded from Google. Responses to the questions provided by Google and ChatGPT were evaluated by a panel of ophthalmologists for accuracy and safety. Evaluators were also asked to distinguish between Google and LLM chatbot answers. Five validated readability indices were used to assess the readability of responses. ChatGPT was instructed to generate operative notes, post-operative instructions, and customizable patient education materials according to specific readability criteria. RESULTS Responses to 20 patient FAQs generated by ChatGPT were significantly longer and written at a higher reading level than responses provided by Google (p < .001), with an average grade level of 14.8 (college level). Expert reviewers were able to correctly distinguish between a human-reviewed and a chatbot-generated response an average of 31% of the time. Google answers contained incorrect or inappropriate material 27% of the time, compared with 6% of LLM-generated answers (p < .001). When expert reviewers were asked to compare the responses directly, chatbot responses were favored (66%). CONCLUSIONS When comparing the responses to patients' cataract FAQs provided by ChatGPT and Google, practicing ophthalmologists overwhelmingly preferred ChatGPT responses. LLM chatbot responses were less likely to contain inaccurate information. ChatGPT represents a viable information source for eye health for patients with higher health literacy. ChatGPT may also be used by ophthalmologists to create customizable patient education materials for patients with varying health literacy.
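One widely used readability index of the kind applied in this study is the Flesch-Kincaid grade level, FKGL = 0.39*(words/sentences) + 11.8*(syllables/word) - 15.59. The Python sketch below applies that formula with a rough vowel-group syllable heuristic; the sample answer text is invented for illustration, not taken from the study.

# Approximate Flesch-Kincaid grade level of a patient-facing answer.
import re

def syllables(word: str) -> int:
    # Crude heuristic: count groups of consecutive vowels (min 1 per word).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text: str) -> float:
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syl = sum(syllables(w) for w in words)
    return 0.39 * len(words) / sentences + 11.8 * syl / len(words) - 15.59

answer = ("A cataract is a clouding of the lens of the eye. "
          "Surgery replaces the cloudy lens with a clear artificial one.")
print(f"approximate grade level: {fk_grade(answer):.1f}")  # roughly grade 7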
Affiliation(s)
- Samuel A Cohen
- Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, USA
- Arthur Brant
- Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, USA
- Ann Caroline Fisher
- Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, USA
- Suzann Pershing
- Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, USA
- Diana Do
- Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, USA
- Carolyn Pan
- Byers Eye Institute, Stanford University School of Medicine, Stanford, CA, USA
5
Cohen SA, Fisher AC, Xu BY, Song BJ. Comparing the Accuracy and Readability of Glaucoma-related Question Responses and Educational Materials by Google and ChatGPT. J Curr Glaucoma Pract 2024;18:110-116. [PMID: 39575130] [PMCID: PMC11576343] [DOI: 10.5005/jp-journals-10078-1448]
Abstract
Aim and background Patients are increasingly turning to the internet to learn more about their ocular disease. In this study, we sought (1) to compare the accuracy and readability of Google and ChatGPT responses to patients' glaucoma-related frequently asked questions (FAQs) and (2) to evaluate ChatGPT's capacity to improve glaucoma patient education materials by accurately reducing the grade level at which they are written. Materials and methods We executed a Google search to identify the three most common FAQs related to each of 10 search terms associated with glaucoma diagnosis and treatment. Each of the 30 FAQs was entered into both Google and ChatGPT, and the responses were recorded. The accuracy of responses was evaluated by three glaucoma specialists, while readability was assessed using five validated readability indices. Subsequently, ChatGPT was instructed to generate patient education materials at specific reading levels to explain seven glaucoma procedures. The accuracy and readability of procedural explanations were measured. Results ChatGPT responses to glaucoma FAQs were significantly more accurate than Google responses (97 vs 77% accuracy, respectively, p < 0.001). ChatGPT responses were also written at a significantly higher reading level (grade 14.3 vs 9.4, respectively, p < 0.001). When instructed to revise glaucoma procedural explanations to improve understandability, ChatGPT reduced the average reading level of educational materials from grade 16.6 (college level) to grade 9.4 (high school level) (p < 0.001) without reducing the accuracy of procedural explanations. Conclusion ChatGPT is more accurate than Google search when responding to glaucoma patient FAQs. ChatGPT successfully reduced the reading level of glaucoma procedural explanations without sacrificing accuracy, with implications for the future of customized patient education for patients with varying health literacy. Clinical significance Our study demonstrates the utility of ChatGPT for patients seeking information about glaucoma and for physicians creating patient education materials at reading levels that optimize patient understanding. An enhanced patient understanding of glaucoma may lead to more informed decision-making and improved treatment compliance.
Affiliation(s)
- Samuel A Cohen
- Department of Ophthalmology, UCLA Stein Eye Institute, Los Angeles, California, United States
- Ann C Fisher
- Department of Ophthalmology, Byers Eye Institute at Stanford, Stanford, California, United States
- Benjamin Y Xu
- Department of Ophthalmology, USC Roski Eye Institute, Los Angeles, California, United States
- Brian J Song
- Department of Ophthalmology, USC Roski Eye Institute, Los Angeles, California, United States
6
Franke A, Weiland B, Bučkova M, Bräuer C, Lauer G, Leonhardt H. Cost minimization analysis of indication-specific osteosynthesis material in oral and maxillofacial surgery. Oral Maxillofac Surg 2024;28:179-184. [PMID: 36331629] [PMCID: PMC10914910] [DOI: 10.1007/s10006-022-01126-2]
Abstract
PURPOSE Following the introduction of Regulation (EU) 2017/745 by the European Parliament, any bioactive substance or surgical implant introduced into the human body must be documented. The regulation requires that every implant be traceable back to the manufacturer, that lot numbers be available for every single medical implant, and that manufacturers supply implants individually packaged and sterilized. Previously, model tray systems (MOS trays), in which individual implants could not be registered separately, were used for osteosynthesis in oral and maxillofacial surgery. The new regulation made it impossible to continue using such systems during surgery, and a change in practice became necessary. We examined a possible solution to meet the new legislation. The aim of this prospective cohort study is to compare MOS tray systems with osteosynthesis materials prepackaged in sets, recording and evaluating parameters such as surgical time and documentation time, together with a short cost analysis for our clinic. The primary aim is to determine how much time is gained or lost through the mandated increase in patient safety; the secondary aim is to describe the change in costs. METHODS Patients who underwent standard surgical procedures at the Department of Oral and Maxillofacial Surgery of the University Hospital Carl Gustav Carus in Dresden were included. We chose open reduction and internal fixation (ORIF) of anterior mandibular corpus fractures and mandibular advancement by bilateral sagittal split osteotomy (BSSO) as standardized procedures; both require two osteosynthesis plates with at least four screws per plate. MOS trays were compared with prepackaged sterilized sets, each containing a drill bit, two plates, and eight 5-mm screws. A total of 40 patients were examined: 20 allocated to the ORIF group and 20 to the BSSO group, with each group evenly subdivided into a MOS tray subgroup and a prepackaged subgroup. The main outcome parameters were the incision-suture time (IST) and the documentation time (DT) required by operating room (OR) staff to complete implant documentation. RESULTS For open reduction, the incision-suture time differed significantly in favor of the MOS tray (p < 0.05); there was no difference between the BSSO groups. However, documentation time was significantly shorter (p < 0.01) with the prepackaged sets in both the ORIF and BSSO groups. In addition, using the prepackaged kits reduced sterilization costs by €11.53 per size-reduced container and cut implant material costs by €38.90 and €43.70, respectively, per standardized procedure. CONCLUSIONS The law necessitates a change in surgical practice. For standardized procedures, the right choice of implants can reduce documentation time and the costs of implant material and sterilization while requiring fewer instruments. This in turn lowers the cost of perioperative processing while providing state-of-the-art implant quality and greater patient safety.
Affiliation(s)
- Adrian Franke
- Department of Oral and Maxillofacial Surgery, University Hospital Carl Gustav Carus, Technische Universität Dresden, 01304, Dresden, Germany
- Bernhard Weiland
- Department of Oral and Maxillofacial Surgery, University Hospital Carl Gustav Carus, 01304, Dresden, Germany
- Michaela Bučkova
- Department of Oral and Maxillofacial Surgery, University Hospital Carl Gustav Carus, 01304, Dresden, Germany
- Christian Bräuer
- Department of Oral and Maxillofacial Surgery, University Hospital Carl Gustav Carus, 01304, Dresden, Germany
- Günter Lauer
- Department of Oral and Maxillofacial Surgery, University Hospital Carl Gustav Carus, 01304, Dresden, Germany
- Henry Leonhardt
- Department of Oral and Maxillofacial Surgery, University Hospital Carl Gustav Carus, 01304, Dresden, Germany
7
Gandhi P A, Goel K, Gupta M, Singh A. Effect of digitization of medical case files on doctor patient relationship in an Out Patient Department setting of Northern India: A comparative study. Indian Journal of Community Health 2022. [DOI: 10.47203/ijch.2022.v34i04.005]
Abstract
Background: Digitization of health records and healthcare delivery processes may affect patient-physician communication and wait times, which in turn shape overall patient satisfaction with healthcare services. Aim & Objective: We ascertained the effect of digitization of medical case files on the doctor-patient relationship (DPR) domain of patient satisfaction at an urban primary health center in India. Settings and Design: Comparative, cross-sectional study in primary health centres. Methods and Material: Patient satisfaction was compared between patients attending a Public Health Dispensary (PHD) that uses a digitized medical case file system and a Civil Dispensary (CD) that follows conventional paper-based medical records, using a Patient Satisfaction Questionnaire (PSQ). Statistical analysis used: Univariate analysis was performed with the chi-square test and adjusted analysis with multiple linear regression. Results: Patient satisfaction with the DPR was the same between the digitized and conventional OPDs (p=0.453). Significantly higher overall patient satisfaction was reported in the conventional paper-based OPD than in the digitized OPD (p<0.001). Conclusions: Patient satisfaction with the doctor-patient relationship (DPR) was the same in the paper-based OPD and the digitized medical case file OPD.
8
Re: “Evaluation of Electronic Health Record Implementation in an Academic Oculoplastics Practice”. Ophthalmic Plast Reconstr Surg 2020;36:622-623. [DOI: 10.1097/iop.0000000000001832]