1. Phipps B, Hadoux X, Sheng B, Campbell JP, Liu TYA, Keane PA, Cheung CY, Chung TY, Wong TY, van Wijngaarden P. AI image generation technology in ophthalmology: Use, misuse and future applications. Prog Retin Eye Res 2025;106:101353. PMID: 40107410. DOI: 10.1016/j.preteyeres.2025.101353.
Abstract
BACKGROUND: AI-powered image generation technology holds the potential to reshape medical practice, yet it remains unfamiliar to medical researchers and clinicians alike. Because adoption of this technology depends on clinician understanding and acceptance, we sought to demystify its use in ophthalmology. To this end, we present a literature review of image generation technology in ophthalmology, examining both its theoretical applications and its future role in clinical practice.
METHODS: First, we consider the key model designs used for image synthesis, including generative adversarial networks, autoencoders, and diffusion models. We then survey the literature on image generation technology in ophthalmology published before September 2024, presenting both the type of model used and its clinical application. Finally, we discuss the limitations of this technology, the risks of its misuse, and future directions for research in this field.
RESULTS: Applications of this technology include improving AI diagnostic models, inter-modality image transformation, more accurate treatment and disease prognostication, image denoising, and individualised education. Key barriers to its adoption include bias in generative models, risks to patient data security, computational and logistical barriers to development, challenges with model explainability, inconsistent use of validation metrics between studies, and misuse of synthetic images. Looking forward, researchers are placing greater emphasis on clinically grounded metrics, the development of image generation foundation models, and the implementation of methods to ensure data provenance.
CONCLUSION: Compared with other medical applications of AI, image generation is still in its infancy. Yet it holds the potential to revolutionise ophthalmology across research, education and clinical practice. This review aims to guide ophthalmic researchers wanting to leverage this technology, while also providing clinicians with insight into how it may change ophthalmic practice in the future.
Affiliation(s)
- Benjamin Phipps
- Centre for Eye Research Australia, Royal Victorian Eye and Ear Hospital, East Melbourne, 3002, VIC, Australia; Ophthalmology, Department of Surgery, University of Melbourne, Parkville, 3010, VIC, Australia
- Xavier Hadoux
- Centre for Eye Research Australia, Royal Victorian Eye and Ear Hospital, East Melbourne, 3002, VIC, Australia; Ophthalmology, Department of Surgery, University of Melbourne, Parkville, 3010, VIC, Australia
- Bin Sheng
- Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
- J Peter Campbell
- Department of Ophthalmology, Casey Eye Institute, Oregon Health and Science University, Portland, USA
- T Y Alvin Liu
- Retina Division, Wilmer Eye Institute, Johns Hopkins University, Baltimore, MD, 21287, USA
- Pearse A Keane
- NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust, London, UK; Institute of Ophthalmology, University College London, London, UK
- Carol Y Cheung
- Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, 999077, China
- Tham Yih Chung
- Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health, Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore and National University Health System, Singapore; Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; Eye Academic Clinical Program (Eye ACP), Duke NUS Medical School, Singapore
- Tien Y Wong
- Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; Tsinghua Medicine, Tsinghua University, Beijing, China; Beijing Visual Science and Translational Eye Research Institute, Beijing Tsinghua Changgung Hospital, Beijing, China
- Peter van Wijngaarden
- Centre for Eye Research Australia, Royal Victorian Eye and Ear Hospital, East Melbourne, 3002, VIC, Australia; Ophthalmology, Department of Surgery, University of Melbourne, Parkville, 3010, VIC, Australia; Florey Institute of Neuroscience & Mental Health, Parkville, VIC, Australia
2. Haider SA, Prabha S, Gomez-Cabello CA, Borna S, Pressman SM, Genovese A, Trabilsy M, Galvao A, Aziz KT, Murray PM, Parte Y, Yu Y, Tao C, Forte AJ. A Validity Analysis of Text-to-Image Generative Artificial Intelligence Models for Craniofacial Anatomy Illustration. J Clin Med 2025;14:2136. PMID: 40217587. PMCID: PMC11989924. DOI: 10.3390/jcm14072136.
Abstract
Background: Anatomically accurate illustrations are imperative in medical education, serving as crucial tools for comprehending complex anatomical structures. While traditional illustration by human artists remains the gold standard, the rapid advancement of Generative Artificial Intelligence (GAI) models presents a new opportunity to automate and accelerate this process. This study evaluated the potential of GAI models to produce craniofacial anatomy illustrations for educational purposes.
Methods: Four GAI models (Midjourney v6.0, DALL-E 3, Gemini Ultra 1.0, and Stable Diffusion 2.0) were used to generate 736 images across multiple views of the surface anatomy, bones, muscles, blood vessels, and nerves of the cranium, in both oil painting and realistic photograph styles. Four reviewers evaluated the images for anatomical detail, aesthetic quality, usability, and cost-effectiveness. Inter-rater reliability analysis assessed evaluation consistency.
Results: Midjourney v6.0 scored highest for aesthetic quality and cost-effectiveness, and DALL-E 3 performed best for anatomical detail and usability. The inter-rater reliability analysis demonstrated a high level of agreement among reviewers (ICC = 0.858, 95% CI). However, all models showed significant flaws in depicting crucial anatomical details such as foramina, suture lines, muscular origins/insertions, and neurovascular structures. These limitations included abstract depictions, mixing of layers, shadowing, abnormal muscle arrangements, and labeling errors.
Conclusions: These findings highlight GAI's potential for rapidly creating craniofacial anatomy illustrations, but also its current limitations, which stem from inadequate training data and an incomplete representation of complex anatomy. Refining these models through precise training data and expert feedback is vital. Ethical considerations, such as potential biases, copyright challenges, and the risks of propagating inaccurate information, must also be carefully navigated. Further refinement of GAI models and ethical safeguards are essential for safe use.
Affiliation(s)
- Syed Ali Haider
- Division of Plastic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA
- Sahar Borna
- Division of Plastic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA
- Ariana Genovese
- Division of Plastic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA
- Maissa Trabilsy
- Division of Plastic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA
- Andrea Galvao
- School of Dentistry, Unichristus, Fortaleza 60190-180, Brazil
- Keith T. Aziz
- Department of Orthopedic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA
- Peter M. Murray
- Department of Orthopedic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA
- Yogesh Parte
- Center for Digital Health, Mayo Clinic, Rochester, MN 55905, USA
- Yunguo Yu
- Center for Digital Health, Mayo Clinic, Rochester, MN 55905, USA
- Cui Tao
- Department of AI and Informatics, Mayo Clinic, Jacksonville, FL 32224, USA
- Antonio Jorge Forte
- Division of Plastic Surgery, Mayo Clinic, Jacksonville, FL 32224, USA
- Center for Digital Health, Mayo Clinic, Rochester, MN 55905, USA
3. Senft Everson N, Gaysynsky A, Iles IA, Schrader KE, Chou WYS. What does an AI-generated "cancer survivor" look like? An analysis of images generated by text-to-image tools. J Cancer Surviv 2025. PMID: 40025001. DOI: 10.1007/s11764-025-01760-1.
Abstract
PURPOSE: Cancer survivorship begins at diagnosis and encompasses a wide variety of experiences, yet prominent societal narratives of survivorship emphasize a positive, post-treatment "return-to-normal." These representations shape how survivorship is understood and experienced by cancer survivors and the public. This study aimed to (1) characterize artificial intelligence (AI)-generated images of cancer survivors and (2) compare them to images of cancer patients to understand how these images might reflect and amplify prevalent survivorship narratives.
METHODS: Two AI text-to-image tools (DALL-E, Stable Diffusion) were prompted to generate 40 images each of cancer survivors and cancer patients (n = 160 images). Images were coded for perceived demographics, affect, health, markers of illness or cancer, and setting. Chi-square analyses tested differences between images of cancer patients and survivors. Quantitative data were complemented by coders' qualitative insights.
RESULTS: Cancer survivors in AI-generated images were largely perceived as White (80%), feminine (80%), young (51%), happy (69%), and healthy (80%), and many images were observed to conform to Western beauty ideals. Pink (64%), cancer ribbons (35%), and head scarves (51%) were prominent visual features in survivor images. Compared to images of cancer patients, survivor images more frequently featured individuals perceived as non-White (p = .03), young (p < .001), affectively positive (p < .001), and healthy (p < .001), and less frequently included markers of illness such as portraying individuals in bed (p < .001) or in medical settings (p < .001).
CONCLUSIONS: AI-generated images of cancer survivors fail to reflect the breadth of survivor demographics or experience.
IMPLICATIONS FOR CANCER SURVIVORS: AI-generated images may perpetuate narrow views of cancer survivorship.
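The chi-square comparisons this abstract describes amount to standard contingency-table tests on coded image counts. A minimal sketch, assuming hypothetical counts (the numbers below are illustrative placeholders, not the study's data, and the "healthy" category is used only as an example coding dimension):

```python
from scipy.stats import chi2_contingency

# Hypothetical coded counts: of 80 "survivor" and 80 "patient" images,
# how many were coded as perceived-healthy. NOT the study's data.
#                  healthy  not healthy
survivor_counts = [64,      16]
patient_counts  = [28,      52]

chi2, p, dof, expected = chi2_contingency([survivor_counts, patient_counts])
print(f"chi2={chi2:.2f}, dof={dof}, p={p:.2e}")
```

With counts this far apart the test reports a very small p-value; each coding dimension (age, affect, setting, and so on) would get its own table and test.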
Affiliation(s)
- Nicole Senft Everson
- Health Communication and Informatics Research Branch, Behavioral Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, Rockville, MD, USA
- Anna Gaysynsky
- Health Communication and Informatics Research Branch, Behavioral Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, Rockville, MD, USA
- ICF Next, ICF, Rockville, MD, USA
- Irina A Iles
- Health Communication and Informatics Research Branch, Behavioral Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, Rockville, MD, USA
- Wen-Ying Sylvia Chou
- Health Communication and Informatics Research Branch, Behavioral Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, Rockville, MD, USA
4. Gupta N, Khatri K, Malik Y, Lakhani A, Kanwal A, Aggarwal S, Dahuja A. Exploring prospects, hurdles, and road ahead for generative artificial intelligence in orthopedic education and training. BMC Med Educ 2024;24:1544. PMID: 39732679. DOI: 10.1186/s12909-024-06592-8.
Abstract
Generative Artificial Intelligence (AI), characterized by its ability to generate diverse forms of content including text, images, video, and audio, has transformed many fields, including medical education, where it enables personalized learning, enhances resource accessibility, and facilitates interactive case studies. This narrative review explores the integration of generative AI into orthopedic education and training, highlighting its potential, current challenges, and future trajectory. A review of recent literature was conducted to evaluate current applications, identify potential benefits, and outline the limitations of integrating generative AI in orthopedic education. Key findings indicate that generative AI holds substantial promise for enhancing orthopedic training through applications such as real-time explanations, adaptive learning materials tailored to each student's specific needs, and immersive virtual simulations. Despite this potential, its integration into orthopedic education faces significant issues, including accuracy, bias, inconsistent outputs, ethical and regulatory concerns, and the critical need for human oversight. Although generative AI models such as ChatGPT have shown impressive capabilities, their current performance on orthopedic exams remains suboptimal, highlighting the need for further development to match the complexity of clinical reasoning and knowledge application. Future research should focus on addressing these challenges by optimizing generative AI models for medical content, exploring best practices for ethical AI usage and curriculum integration, and evaluating the long-term impact of these technologies on learning outcomes. By expanding AI's knowledge base, refining its ability to interpret clinical images, and ensuring reliable, unbiased outputs, generative AI holds the potential to revolutionize orthopedic education. This work provides a framework for incorporating generative AI into orthopedic curricula to create a more effective, engaging, and adaptive learning environment for future orthopedic practitioners.
Affiliation(s)
- Nikhil Gupta
- Department of Pharmacology, All India Institute of Medical Sciences, Bathinda, Punjab, 151001, India
- Kavin Khatri
- Department of Orthopedics, Postgraduate Institute of Medical Education and Research (PGIMER) Satellite Centre, Sangrur, Punjab, 148001, India
- Yogender Malik
- Department of Forensic Medicine and Toxicology, Bhagat Phool Singh Govt Medical College for Women, Khanpur Kalan, Sonepat, Haryana, 131305, India
- Amit Lakhani
- Department of Orthopedics, Dr B.R. Ambedkar State Institute of Medical Sciences, Mohali, Punjab, 160055, India
- Abhinav Kanwal
- Department of Pharmacology, All India Institute of Medical Sciences, Bathinda, Punjab, 151001, India
- Sameer Aggarwal
- Department of Orthopedics, Postgraduate Institute of Medical Education and Research (PGIMER), Chandigarh, 160012, India
- Anshul Dahuja
- Department of Orthopedics, Guru Gobind Singh Medical College and Hospital, Faridkot, Punjab, 151203, India
5. Ogundiya O, Rahman TJ, Valnarov-Boulter I, Young TM. Looking Back on Digital Medical Education Over the Last 25 Years and Looking to the Future: Narrative Review. J Med Internet Res 2024;26:e60312. PMID: 39700490. PMCID: PMC11695957. DOI: 10.2196/60312.
Abstract
BACKGROUND: The last 25 years have seen enormous progress in digital technologies across the whole of the health service, including health education. The rapid evolution and use of web-based and digital techniques have been significantly transforming this field since the beginning of the new millennium, and these advancements have continued apace since the COVID-19 pandemic.
OBJECTIVE: This narrative review aims to outline and discuss the developments that have taken place in digital medical education across the defined time frame. In addition, evidence for potential opportunities and challenges facing digital medical education in the near future was collated for analysis.
METHODS: Literature reviews were conducted using PubMed, Web of Science Core Collection, Scopus, Google Scholar, and Embase. The participants and learners in this study included medical students, physicians in training or continuing professional development, nurses, paramedics, and patients.
RESULTS: Evidence of the significant steps in the development of digital medical education in the past 25 years was presented and analyzed in terms of application, impact, and implications for the future. The results were grouped into the following themes for discussion: learning management systems; telemedicine (in digital medical education); mobile health; big data analytics; the metaverse, augmented reality, and virtual reality; the COVID-19 pandemic; artificial intelligence; and ethics and cybersecurity.
CONCLUSIONS: Major changes and developments in digital medical education have occurred since around the start of the new millennium. Key steps in this journey include technical developments in teleconferencing and learning management systems, along with a marked increase in the use of mobile devices for accessing learning over this time. While the pace of evolution in digital medical education accelerated during the COVID-19 pandemic, rapid progress has continued since the pandemic resolved. Many of the technologies involved, such as augmented reality, virtual reality, and artificial intelligence, are now widely used in health education and other fields, and offer significant future potential. The opportunities these technologies present must be balanced against the associated challenges in areas such as cybersecurity, the integrity of web-based assessments, ethics, and digital privacy to ensure that digital medical education continues to thrive.
Affiliation(s)
- Ioan Valnarov-Boulter
- Queen Square Institute of Neurology, University College London, London, United Kingdom
- Tim Michael Young
- Queen Square Institute of Neurology, University College London, London, United Kingdom
6. Muhr P, Pan Y, Tumescheit C, Kübler AK, Parmaksiz HK, Chen C, Bolaños Orozco PS, Lienkamp SS, Hastings J. Evaluating Text-to-Image Generated Photorealistic Images of Human Anatomy. Cureus 2024;16:e74193. PMID: 39712811. PMCID: PMC11663238. DOI: 10.7759/cureus.74193.
Abstract
BACKGROUND: Generative artificial intelligence (AI) models that can produce photorealistic images from text descriptions have many applications in medicine, including medical education and the generation of synthetic data. However, it can be challenging to evaluate their heterogeneous outputs and to compare different models, so a systematic approach enabling image and model comparisons is needed.
METHOD: To address this gap, we developed an error classification system for annotating errors in AI-generated photorealistic images of humans and applied it to a corpus of 240 images generated with three different models (DALL-E 3, Stable Diffusion XL, and Stable Cascade) using 10 prompts, with eight images per prompt.
RESULTS: The error classification system identifies five error types at three severities across five anatomical regions, and specifies an associated quantitative scoring method based on aggregated proportions of errors per expected count of anatomical components in the generated image. We assessed inter-rater agreement by double-annotating 25% of the images and calculating Krippendorff's alpha, and compared results across the three models and 10 prompts quantitatively using a cumulative score per image. The error classification system, accompanying training manual, generated image collection, annotations, and all associated scripts are available from our GitHub repository at https://github.com/hastingslab-org/ai-human-images. Inter-rater agreement was relatively poor, reflecting the subjectivity of the error classification task. Model comparisons revealed that DALL-E 3 performed consistently better than Stable Diffusion; however, the latter generated images reflecting more diversity in personal attributes. Images with groups of people were more challenging for all the models than individuals or pairs, and some prompts were challenging for all models.
CONCLUSION: Our method enables systematic comparison of AI-generated photorealistic images of humans; our results can serve to catalyse improvements in these models for medical applications.
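The Krippendorff's alpha agreement measure mentioned in this abstract can be sketched for the simplest setting the study describes: two coders double-annotating a subset, nominal categories, no missing data. The labels below are invented for illustration and are not the study's annotation scheme:

```python
from collections import Counter
from itertools import product

def krippendorff_alpha_nominal(pairs):
    """Krippendorff's alpha for two coders, nominal labels, no missing data.

    `pairs` is a list of (label_coder1, label_coder2) tuples, one per item.
    """
    # Coincidence matrix: each item contributes both ordered label pairs
    # (the per-unit divisor m_u - 1 equals 1 when there are two coders).
    o = Counter()
    for a, b in pairs:
        o[(a, b)] += 1
        o[(b, a)] += 1
    n_c = Counter()                       # marginal label frequencies
    for (a, _b), count in o.items():
        n_c[a] += count
    n_total = sum(n_c.values())           # = 2 * number of items
    # Observed disagreement: off-diagonal mass of the coincidence matrix.
    d_o = sum(c for (a, b), c in o.items() if a != b) / n_total
    # Expected disagreement under chance, from the marginals.
    d_e = sum(n_c[a] * n_c[b] for a, b in product(n_c, repeat=2) if a != b) \
          / (n_total * (n_total - 1))
    return 1.0 - d_o / d_e

# Illustrative double-annotated error labels (not the study's annotations):
pairs = [("ok", "ok"), ("minor", "minor"), ("ok", "minor"), ("ok", "ok")]
print(krippendorff_alpha_nominal(pairs))  # moderate agreement
```

Perfect agreement yields alpha = 1, chance-level agreement yields alpha near 0, so the "relatively poor" agreement the study reports corresponds to values well below 1.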
Affiliation(s)
- Paula Muhr
- Faculty of Medicine, Institute for Implementation Science in Health Care, University of Zurich, Zurich, CHE
- Yating Pan
- Digital Society Initiative, University of Zurich, Zurich, CHE
- Charlotte Tumescheit
- Faculty of Medicine, Institute for Implementation Science in Health Care, University of Zurich, Zurich, CHE
- Cheng Chen
- Digital Society Initiative, University of Zurich, Zurich, CHE
- Soeren S Lienkamp
- Faculty of Medicine, Institute for Anatomy, University of Zurich, Zurich, CHE
- Janna Hastings
- Faculty of Medicine, Institute for Implementation Science in Health Care, University of Zurich, Zurich, CHE
7. Moin KA, Nasir AA, Petroff DJ, Loveless BA, Moshirfar OA, Hoopes PC, Moshirfar M. Assessment of Generative Artificial Intelligence (AI) Models in Creating Medical Illustrations for Various Corneal Transplant Procedures. Cureus 2024;16:e67833. PMID: 39328681. PMCID: PMC11424388. DOI: 10.7759/cureus.67833.
Abstract
PURPOSE: This study aimed to task and assess generative artificial intelligence (AI) models with creating medical illustrations for corneal transplant procedures such as Descemet's stripping automated endothelial keratoplasty (DSAEK), Descemet's membrane endothelial keratoplasty (DMEK), deep anterior lamellar keratoplasty (DALK), and penetrating keratoplasty (PKP).
METHODS: Six engineered prompts were provided to Decoder-Only Autoregressive Language and Image Synthesis 3 (DALL-E 3) and Medical Illustration Manager (MIM) to guide these generative AI models in creating a final medical illustration for each of the four corneal transplant procedures. Control illustrations were created by the authors for each transplant technique for comparison. A grading system with five categories, each worth a maximum of 3 points (15 points total), was designed to objectively assess AI's performance. Four independent reviewers analyzed and scored the final images produced by DALL-E 3 and MIM as well as the control illustrations. All AI-generated images and control illustrations were then provided to Chat Generative Pre-Trained Transformer-4o (ChatGPT-4o), which was tasked with grading each image using the same grading system. All results were then tabulated and graphically depicted.
RESULTS: The control illustrations received significantly higher scores than the images produced by DALL-E 3 and MIM in legibility, anatomical realism and accuracy, procedural step accuracy, and lack of fictitious anatomy (p<0.001). For detail and clarity, the control illustrations and the images produced by DALL-E 3 and MIM received statistically similar scores of 2.75±0.29, 2.19±0.24, and 2.50±0.29, respectively (p=0.0504). With regard to mean cumulative scores for each transplant procedure image, the control illustrations received a significantly higher score than DALL-E 3 and MIM (p<0.001). Additionally, the overall mean cumulative score for the control illustrations was significantly higher than for DALL-E 3 and MIM (14.56±0.51 (97.1%), 4.38±1.2 (29.2%), and 5.63±1.82 (37.5%), respectively; p<0.001). When assessing AI's grading performance, ChatGPT-4o scored the images produced by DALL-E 3 and MIM significantly higher than the average scores of the independent reviewers (DALL-E 3: 10.0±0.0 (66.6%) vs. 4.38±1.20 (29.2%), p<0.001; MIM: 10.0±0.0 (66.6%) vs. 5.63±1.82 (37.5%), p<0.001). However, mean scores for the control illustrations were comparable between ChatGPT-4o and the independent reviewers (15.0±0.0 (100%) vs. 14.56±0.13 (97.1%); p>0.05).
CONCLUSION: AI is an extremely powerful and efficient tool for many tasks, but it is currently limited in producing accurate medical illustrations for corneal transplant procedures. Further development is required for generative AI models to create medically sound and accurate illustrations for use in ophthalmology.
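A minimal sketch of how a 15-point rubric like the one this abstract describes (five categories, 0 to 3 points each, four independent reviewers) might be aggregated into the reported mean ± SD cumulative scores. The category names follow the abstract; the scores themselves are hypothetical, not the study's data:

```python
import numpy as np

CATEGORIES = ["legibility", "detail & clarity", "anatomical realism & accuracy",
              "procedural step accuracy", "lack of fictitious anatomy"]

# Hypothetical per-category scores (0-3) from four reviewers for one image.
reviewer_scores = np.array([
    [1, 2, 1, 0, 1],   # reviewer 1
    [1, 2, 0, 1, 1],   # reviewer 2
    [0, 3, 1, 0, 0],   # reviewer 3
    [1, 2, 1, 1, 1],   # reviewer 4
])

cumulative = reviewer_scores.sum(axis=1)   # one 15-point total per reviewer
print(f"cumulative: mean={cumulative.mean():.2f}, "
      f"sd={cumulative.std(ddof=1):.2f} "
      f"({100 * cumulative.mean() / 15:.1f}% of 15)")
```

Repeating this per image and per model, then comparing the group means, reproduces the shape of the reported results (e.g. a mean of 14.56/15 for controls vs. 4.38/15 for DALL-E 3).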
Affiliation(s)
- Kayvon A Moin
- Hoopes Vision Research Center, Hoopes Vision, Draper, USA
- School of Medicine, American University of the Caribbean, Cupecoy, SXM
- Ayesha A Nasir
- Ophthalmology, University of Louisville School of Medicine, Louisville, USA
- Dallas J Petroff
- Ophthalmology, Idaho College of Osteopathic Medicine, Meridian, USA
- Bosten A Loveless
- Ophthalmology, Rocky Vista University College of Osteopathic Medicine, Ivins, USA
- Hoopes Vision Research Center, Hoopes Vision, Draper, USA
- Omeed A Moshirfar
- Sam Fox School of Design and Visual Art, Washington University in St. Louis, St. Louis, USA
- Majid Moshirfar
- Hoopes Vision Research Center, Hoopes Vision, Draper, USA
- John A. Moran Eye Center, University of Utah School of Medicine, Salt Lake City, USA
- Eye Banking and Corneal Transplantation, Utah Lions Eye Bank, Murray, USA
8. Petroff DJ, Nasir AA, Moin KA, Loveless BA, Moshirfar OA, Hoopes PC, Moshirfar M. Evaluating the Accuracy of Artificial Intelligence (AI)-Generated Illustrations for Laser-Assisted In Situ Keratomileusis (LASIK), Photorefractive Keratectomy (PRK), and Small Incision Lenticule Extraction (SMILE). Cureus 2024;16:e67747. PMID: 39318903. PMCID: PMC11421855. DOI: 10.7759/cureus.67747.
Abstract
PURPOSE: To utilize artificial intelligence (AI) platforms to generate medical illustrations for refractive surgeries, aiding patients in visualizing and comprehending procedures such as laser-assisted in situ keratomileusis (LASIK), photorefractive keratectomy (PRK), and small incision lenticule extraction (SMILE). This study assesses the current performance of two OpenAI programs in terms of their accuracy for common corneal refractive procedures.
METHODS: We selected AI image generators based on their popularity, choosing Decoder-Only Autoregressive Language and Image Synthesis 3 (DALL-E 3) for its leading position and Medical Illustration Master (MiM) for its high engagement. We developed six non-AI-generated prompts targeting specific outcomes related to LASIK, PRK, and SMILE procedures to assess medical accuracy. We generated images using these prompts (18 images per AI platform in total) and used the final images produced after the sixth prompt for this study (three final images per AI platform). Human-created procedural images were also gathered for comparison. Four experts independently graded the images, and their scores were averaged. Each image was evaluated with our grading system on "Legibility," "Detail & Clarity," "Anatomical Realism & Accuracy," "Procedural Step Accuracy," and "Lack of Fictitious Anatomy," with scores ranging from 0 to 3 per category, allowing 15 points in total. A score of 15 points signifies excellent performance, indicating a highly accurate medical illustration; a low score indicates a poor-quality illustration. Additionally, we submitted the same AI-generated images back into Chat Generative Pre-Trained Transformer-4o (ChatGPT-4o) along with our grading system, allowing ChatGPT-4o to evaluate both the AI-generated and human-created images (HCIs).
RESULTS: In individual category scoring, HCIs significantly outperformed AI images in legibility, anatomical realism, procedural step accuracy, and lack of fictitious anatomy. There were no significant differences between DALL-E 3 and MiM in these categories (p>0.05). In procedure-specific comparisons, HCIs consistently scored higher than AI-generated images for LASIK, PRK, and SMILE. For LASIK, HCIs scored 14 ± 0.82 (93.3%), while DALL-E 3 scored 4.5 ± 0.58 (30%) and MiM scored 4.5 ± 1.91 (30%) (p<0.001). For PRK, HCIs scored 14.5 ± 0.58 (96.7%), compared to DALL-E 3's 5.25 ± 1.26 (35%) and MiM's 7 ± 3.56 (46.7%) (p<0.001). For SMILE, HCIs scored 14.5 ± 0.68 (96.7%), while DALL-E 3 scored 5 ± 0.82 (33.3%) and MiM scored 6 ± 2.71 (40%) (p<0.001). Overall, HCIs significantly outperformed AI-generated images from DALL-E 3 and MiM, achieving scores of 14.33 ± 0.23 (95.6%), 4.93 ± 0.69 (32.8%), and 5.83 ± 0.23 (38.9%), respectively (p<0.001). ChatGPT-4o evaluations were consistent with human evaluations for HCIs (3 ± 0 vs. 2.87 ± 0.23; p=0.121) but rated AI images higher than the human evaluators did (2 ± 0 vs. 1.07 ± 0.73; p<0.001).
CONCLUSION: This study highlights the inaccuracy of AI-generated images in illustrating corneal refractive procedures such as LASIK, PRK, and SMILE. Although the OpenAI platforms can create images recognizable as eyes, those images lack educational value. AI excels at quickly generating creative, vibrant images, but accurate medical illustration remains a significant challenge. While AI performs well on text-based tasks, its capability to produce precise medical images needs substantial improvement.
Affiliation(s)
- Dallas J Petroff
- Ophthalmology, Idaho College of Osteopathic Medicine, Meridian, USA
- Ayesha A Nasir
- Ophthalmology, University of Louisville School of Medicine, Louisville, USA
- Kayvon A Moin
- Ophthalmology, Hoopes Vision Research Center, Hoopes Vision, Draper, USA
- Medicine, American University of the Caribbean, Cupecoy, SXM
- Bosten A Loveless
- Ophthalmology, Rocky Vista University College of Osteopathic Medicine, Ivins, USA
- Ophthalmology, Hoopes Vision Research Center, Hoopes Vision, Draper, USA
- Omeed A Moshirfar
- Sam Fox School of Design and Visual Art, Washington University in St. Louis, St. Louis, USA
- Phillip C Hoopes
- Ophthalmology, Hoopes Vision Research Center, Hoopes Vision, Draper, USA
- Majid Moshirfar
- Ophthalmology, Hoopes Vision Research Center, Hoopes Vision, Draper, USA
- Ophthalmology, John A. Moran Eye Center, University of Utah School of Medicine, Salt Lake City, USA
- Eye Banking and Corneal Transplantation, Utah Lions Eye Bank, Murray, USA
9. Ajmera P, Nischal N, Ariyaratne S, Botchu B, Bhamidipaty K, Iyengar KP, Ajmera SR, Jenko N, Botchu R. Response to: ChatGPT's limited accuracy in generating anatomical images for medical. Skeletal Radiol 2024;53:1597. PMID: 38506965. DOI: 10.1007/s00256-024-04656-w.
Affiliation(s)
- P Ajmera
- Department of Radiology, Mayo Clinic, Rochester, MN, USA
- N Nischal
- Department of Radiology, Holy Family Hospital, New Delhi, India
- S Ariyaratne
- Department of Musculoskeletal Radiology, Royal Orthopedic Hospital, Birmingham, UK
- K P Iyengar
- Department of Orthopedics, Southport and Ormskirk Hospital, Mersey and West Lancashire NHS Trust, Southport, UK
- S R Ajmera
- Department of Radiology, Mayo Clinic, Rochester, MN, USA
- N Jenko
- Department of Musculoskeletal Radiology, Royal Orthopedic Hospital, Birmingham, UK
- R Botchu
- Department of Musculoskeletal Radiology, Royal Orthopedic Hospital, Birmingham, UK
10. Klug J, Pietsch U. Can artificial intelligence help for scientific illustration? Details matter. Crit Care 2024;28:196. PMID: 38858791. PMCID: PMC11165878. DOI: 10.1186/s13054-024-04970-8.
Affiliation(s)
- Julian Klug
- Division of Perioperative Intensive Care Medicine, Cantonal Hospital St. Gallen, Rorschacher Strasse 95, 9007, St. Gallen, Switzerland
- Stroke Research Group, Department of Clinical Neurosciences, Faculty of Medicine, University Hospital, Geneva, Switzerland
- Urs Pietsch
- Division of Perioperative Intensive Care Medicine, Cantonal Hospital St. Gallen, Rorschacher Strasse 95, 9007, St. Gallen, Switzerland
- Department of Emergency Medicine, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland