1. Radha K, Karuna Y. Latent space autoencoder generative adversarial model for retinal image synthesis and vessel segmentation. BMC Med Imaging 2025; 25:149. [PMID: 40325399] [DOI: 10.1186/s12880-025-01694-1]
Abstract
Diabetes is a widespread condition that can lead to serious vision problems over time. Timely identification and treatment of diabetic retinopathy (DR) depend on accurately segmenting retinal vessels, which can be achieved through the non-invasive technique of fundus imaging. This methodology enables systematic monitoring and assessment of DR progression. In recent years, deep learning has made significant strides in various fields, including medical image processing, and numerous algorithms with excellent performance have been developed for segmenting retinal vessels in fundus images. However, large datasets are essential for training deep learning models that generalize well, and a major challenge in retinal vessel segmentation is the scarcity of ground-truth samples for training. To overcome this, we aim to generate synthetic data. This work draws inspiration from recent advances in generative adversarial networks (GANs). Our goal is to generate multiple realistic retinal fundus images from tubular structured annotations while simultaneously creating binary vessel masks from retinal fundus images. We integrate a latent-space autoencoder to preserve vessel morphology when generating RGB fundus images and mask images. This approach can synthesize diverse images from a single tubular structured annotation and generate various tubular structures from a single fundus image. To test our method, we used three primary datasets, DRIVE, STARE, and CHASE_DB, to synthesize data and to evaluate the proposed image-to-image translation approach. We then trained and tested a simple UNet segmentation model on the synthetic data and compared its performance against training on the standard datasets. The results indicated that the synthetic data supported excellent segmentation performance, a crucial consideration in medical image analysis, where small datasets are common. This demonstrates the potential of synthetic data as a valuable resource for training segmentation and classification models for disease diagnosis.
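
To make the adversarial setup described above concrete, the following minimal PyTorch sketch shows one training step of a mask-to-image generator with a latent-space consistency term. It illustrates the general idea under stated assumptions, not the authors' implementation: the tiny networks, the loss weight of 10.0, and the random tensors standing in for paired data are all placeholders.

```python
# Minimal sketch (not the paper's code): a mask-to-image GAN step where a latent
# autoencoder E constrains generated fundus images to preserve vessel morphology.
import torch
import torch.nn as nn

def conv(ci, co):
    return nn.Sequential(nn.Conv2d(ci, co, 3, padding=1), nn.ReLU())

G = nn.Sequential(conv(1, 16), conv(16, 16), nn.Conv2d(16, 3, 3, padding=1))  # mask -> RGB fundus
D = nn.Sequential(conv(3, 16), nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1))  # real/fake
E = nn.Sequential(conv(3, 16), nn.AdaptiveAvgPool2d(4), nn.Flatten())          # image -> latent code

bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(list(G.parameters()) + list(E.parameters()), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

mask = torch.rand(4, 1, 64, 64)   # stand-in for a tubular vessel annotation
real = torch.rand(4, 3, 64, 64)   # stand-in for the paired real fundus image

# Discriminator step: real images labelled 1, generated images labelled 0.
fake = G(mask)
loss_d = bce(D(real), torch.ones(4, 1)) + bce(D(fake.detach()), torch.zeros(4, 1))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: fool D, plus a latent-space consistency term so the synthetic
# image carries the same morphology code as its real counterpart.
loss_g = bce(D(fake), torch.ones(4, 1)) + 10.0 * nn.functional.l1_loss(E(fake), E(real))
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```
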
Affiliation(s)
- K Radha: School of Electronics Engineering, Vellore Institute of Technology, Vellore, India
- Yepuganti Karuna: School of Electronics Engineering, VIT-AP University, Amaravati, India

2. Pandey PU, Micieli JA, Ong Tone S, Eng KT, Kertes PJ, Wong JCY. Realistic fundus photograph generation for improving automated disease classification. Br J Ophthalmol 2025:bjo-2024-326122. [PMID: 39939121] [DOI: 10.1136/bjo-2024-326122]
Abstract
AIMS: This study aims to investigate whether denoising diffusion probabilistic models (DDPMs) could generate realistic retinal images, and whether they could be used to improve the performance of a deep convolutional neural network (CNN) ensemble for multiple retinal disease classification, which was previously shown to outperform human experts. METHODS: We trained DDPMs to generate retinal fundus images representing diabetic retinopathy, age-related macular degeneration, glaucoma or normal eyes. Eight board-certified ophthalmologists evaluated 96 test images to assess the realism of the generated images and classified them based on disease labels. Subsequently, between 100 and 1000 generated images were employed to augment training of deep convolutional ensembles for classifying retinal disease. We measured the accuracy of ophthalmologists in correctly identifying real and generated images, as well as the classification accuracy, F-score and area under the receiver operating curve of a trained CNN in classifying retinal diseases from a test set of 100 fundus images. RESULTS: Ophthalmologists exhibited a mean accuracy of 61.1% (range: 51.0%-68.8%) in differentiating real and generated images. Augmenting the training set with 238 generated images in the smallest class statistically significantly improved the F-score and accuracy by 5.3% and 5.8%, respectively (p<0.01) in a retinal disease classification task, compared with a baseline model trained only with real images. CONCLUSIONS: The diffusion models generated highly realistic retinal images, as validated by human experts. Adding generated images to the training set improved performance of a CNN ensemble without requiring additional real patient data.
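
For readers unfamiliar with DDPMs, the sketch below illustrates the standard training objective such models are built on (the textbook epsilon-prediction formulation, not this paper's code): noise a clean image with the closed-form forward process at a random timestep, then regress the added noise. The tiny convolutional denoiser is a stand-in assumption; real systems use a time-conditioned U-Net.

```python
# Minimal sketch of the standard DDPM training objective (illustrative only).
import torch
import torch.nn as nn

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # linear noise schedule
alpha_bar = torch.cumprod(1.0 - betas, dim=0)   # cumulative product \bar{alpha}_t

# Stand-in denoiser; real DDPMs use a time-conditioned U-Net.
eps_model = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(32, 3, 3, padding=1))

x0 = torch.rand(8, 3, 64, 64)                   # clean training fundus images
t = torch.randint(0, T, (8,))                   # random timestep per sample
eps = torch.randn_like(x0)
ab = alpha_bar[t].view(-1, 1, 1, 1)
x_t = ab.sqrt() * x0 + (1 - ab).sqrt() * eps    # forward process q(x_t | x_0)

loss = nn.functional.mse_loss(eps_model(x_t), eps)  # predict the injected noise
loss.backward()
```
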
Affiliation(s)
- Prashant U Pandey: School of Biomedical Engineering, The University of British Columbia, Vancouver, British Columbia, Canada
- Jonathan A Micieli: Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada; Kensington Vision and Research Centre and Kensington Research Institute, Toronto, Ontario, Canada; Department of Ophthalmology, St. Michael's Hospital, Unity Health, Toronto, Ontario, Canada
- Stephan Ong Tone: Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada; Kensington Vision and Research Centre and Kensington Research Institute, Toronto, Ontario, Canada; Department of Ophthalmology and Vision Sciences, Sunnybrook Research Institute, Toronto, Ontario, Canada; John and Liz Tory Eye Centre, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada
- Kenneth T Eng: Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada; John and Liz Tory Eye Centre, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada
- Peter J Kertes: Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada; Kensington Vision and Research Centre and Kensington Research Institute, Toronto, Ontario, Canada; John and Liz Tory Eye Centre, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada
- Jovi C Y Wong: Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada

3. Phipps B, Hadoux X, Sheng B, Campbell JP, Liu TYA, Keane PA, Cheung CY, Chung TY, Wong TY, van Wijngaarden P. AI image generation technology in ophthalmology: Use, misuse and future applications. Prog Retin Eye Res 2025; 106:101353. [PMID: 40107410] [DOI: 10.1016/j.preteyeres.2025.101353]
Abstract
BACKGROUND: AI-powered image generation technology holds the potential to reshape medical practice, yet it remains unfamiliar to medical researchers and clinicians alike. Given that the adoption of this technology relies on clinician understanding and acceptance, we sought to demystify its use in ophthalmology. To this end, we present a literature review on image generation technology in ophthalmology, examining both its theoretical applications and its future role in clinical practice. METHODS: First, we consider the key model designs used for image synthesis, including generative adversarial networks, autoencoders, and diffusion models. We then survey the literature on image generation technology in ophthalmology prior to September 2024, presenting both the type of model used and its clinical application. Finally, we discuss the limitations of this technology, the risks of its misuse, and future directions of research in this field. RESULTS: Applications of this technology include improving AI diagnostic models, inter-modality image transformation, more accurate treatment and disease prognostication, image denoising, and individualised education. Key barriers to its adoption include bias in generative models, risks to patient data security, computational and logistical barriers to development, challenges with model explainability, inconsistent use of validation metrics between studies, and misuse of synthetic images. Looking forward, researchers are placing further emphasis on clinically grounded metrics, the development of image generation foundation models, and the implementation of methods to ensure data provenance. CONCLUSION: Compared to other medical applications of AI, image generation is still in its infancy. Yet it holds the potential to revolutionise ophthalmology across research, education and clinical practice. This review aims to guide ophthalmic researchers wanting to leverage this technology, while also providing insight for clinicians on how it may change ophthalmic practice in the future.
Affiliation(s)
- Benjamin Phipps: Centre for Eye Research Australia, Royal Victorian Eye and Ear Hospital, East Melbourne, 3002, VIC, Australia; Ophthalmology, Department of Surgery, University of Melbourne, Parkville, 3010, VIC, Australia
- Xavier Hadoux: Centre for Eye Research Australia, Royal Victorian Eye and Ear Hospital, East Melbourne, 3002, VIC, Australia; Ophthalmology, Department of Surgery, University of Melbourne, Parkville, 3010, VIC, Australia
- Bin Sheng: Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
- J Peter Campbell: Department of Ophthalmology, Casey Eye Institute, Oregon Health and Science University, Portland, USA
- T Y Alvin Liu: Retina Division, Wilmer Eye Institute, Johns Hopkins University, Baltimore, MD, 21287, USA
- Pearse A Keane: NIHR Biomedical Research Centre at Moorfields Eye Hospital NHS Foundation Trust, London, UK; Institute of Ophthalmology, University College London, London, UK
- Carol Y Cheung: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, 999077, China
- Tham Yih Chung: Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Centre for Innovation and Precision Eye Health, Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore and National University Health System, Singapore; Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; Eye Academic Clinical Program (Eye ACP), Duke NUS Medical School, Singapore
- Tien Y Wong: Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; Tsinghua Medicine, Tsinghua University, Beijing, China; Beijing Visual Science and Translational Eye Research Institute, Beijing Tsinghua Changgung Hospital, Beijing, China
- Peter van Wijngaarden: Centre for Eye Research Australia, Royal Victorian Eye and Ear Hospital, East Melbourne, 3002, VIC, Australia; Ophthalmology, Department of Surgery, University of Melbourne, Parkville, 3010, VIC, Australia; Florey Institute of Neuroscience & Mental Health, Parkville, VIC, Australia

4. Zhu Z, Wang Y, Qi Z, Hu W, Zhang X, Wagner SK, Wang Y, Ran AR, Ong J, Waisberg E, Masalkhi M, Suh A, Tham YC, Cheung CY, Yang X, Yu H, Ge Z, Wang W, Sheng B, Liu Y, Lee AG, Denniston AK, Wijngaarden PV, Keane PA, Cheng CY, He M, Wong TY. Oculomics: Current concepts and evidence. Prog Retin Eye Res 2025; 106:101350. [PMID: 40049544] [DOI: 10.1016/j.preteyeres.2025.101350]
Abstract
The eye provides novel insights into general health, as well as the pathogenesis and development of systemic diseases. In the past decade, growing evidence has demonstrated that the eye's structure and function mirror multiple systemic health conditions, especially cardiovascular diseases, neurodegenerative disorders, and kidney impairment. This has given rise to the field of oculomics: the application of ophthalmic biomarkers to understand mechanisms and to detect and predict disease. The development of this field has been accelerated by three major advances: 1) the availability and widespread clinical adoption of high-resolution, non-invasive ophthalmic imaging ("hardware"); 2) the availability of large studies to interrogate associations ("big data"); and 3) the development of novel analytical methods, including artificial intelligence (AI) ("software"). Oculomics offers an opportunity to enhance our understanding of the interplay between the eye and the body, while supporting the development of innovative diagnostic, prognostic, and therapeutic tools. It enables the detection, screening, diagnosis, and monitoring of many systemic health conditions. Furthermore, oculomics with AI allows prediction of systemic disease risk, enabling risk stratification and opening new avenues for individualized risk prediction and prevention, facilitating personalized medicine. In this review, we summarise current concepts and evidence in the field of oculomics, highlighting the progress that has been made, remaining challenges, and opportunities for future research.
Affiliation(s)
- Zhuoting Zhu: Centre for Eye Research Australia, Ophthalmology, University of Melbourne, Melbourne, VIC, Australia; Department of Surgery (Ophthalmology), University of Melbourne, Melbourne, VIC, Australia
- Yueye Wang: School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong, China
- Ziyi Qi: Centre for Eye Research Australia, Ophthalmology, University of Melbourne, Melbourne, VIC, Australia; Department of Surgery (Ophthalmology), University of Melbourne, Melbourne, VIC, Australia; Department of Ophthalmology, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, National Clinical Research Center for Eye Diseases, Shanghai, China
- Wenyi Hu: Centre for Eye Research Australia, Ophthalmology, University of Melbourne, Melbourne, VIC, Australia; Department of Surgery (Ophthalmology), University of Melbourne, Melbourne, VIC, Australia
- Xiayin Zhang: Guangdong Eye Institute, Department of Ophthalmology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, China
- Siegfried K Wagner: NIHR Biomedical Research Centre, Moorfields Eye Hospital NHS Foundation Trust, London, UK; Institute of Ophthalmology, University College London, London, UK
- Yujie Wang: Centre for Eye Research Australia, Ophthalmology, University of Melbourne, Melbourne, VIC, Australia; Department of Surgery (Ophthalmology), University of Melbourne, Melbourne, VIC, Australia
- An Ran Ran: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong, China
- Joshua Ong: Department of Ophthalmology and Visual Sciences, University of Michigan Kellogg Eye Center, Ann Arbor, USA
- Ethan Waisberg: Department of Clinical Neurosciences, University of Cambridge, Cambridge, UK
- Mouayad Masalkhi: University College Dublin School of Medicine, Belfield, Dublin, Ireland
- Alex Suh: Tulane University School of Medicine, New Orleans, LA, USA
- Yih Chung Tham: Department of Ophthalmology and Centre for Innovation and Precision Eye Health, Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; Ophthalmology and Visual Sciences Academic Clinical Program, Duke-NUS Medical School, Singapore
- Carol Y Cheung: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong, China
- Xiaohong Yang: Guangdong Eye Institute, Department of Ophthalmology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, China
- Honghua Yu: Guangdong Eye Institute, Department of Ophthalmology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, China
- Zongyuan Ge: Monash e-Research Center, Faculty of Engineering, Airdoc Research, Nvidia AI Technology Research Center, Monash University, Melbourne, VIC, Australia
- Wei Wang: State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
- Bin Sheng: Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
- Yun Liu: Google Research, Mountain View, CA, USA
- Andrew G Lee: Center for Space Medicine and the Department of Ophthalmology, Baylor College of Medicine, Houston, USA; Department of Ophthalmology, Blanton Eye Institute, Houston Methodist Hospital, Houston, USA; The Houston Methodist Research Institute, Houston Methodist Hospital, Houston, USA; Departments of Ophthalmology, Neurology, and Neurosurgery, Weill Cornell Medicine, New York, USA; Department of Ophthalmology, University of Texas Medical Branch, Galveston, USA; University of Texas MD Anderson Cancer Center, Houston, USA; Texas A&M College of Medicine, Bryan, USA; Department of Ophthalmology, The University of Iowa Hospitals and Clinics, Iowa City, USA
- Alastair K Denniston: NIHR Biomedical Research Centre, Moorfields Eye Hospital NHS Foundation Trust, London, UK; Institute of Ophthalmology, University College London, London, UK; National Institute for Health and Care Research (NIHR) Birmingham Biomedical Research Centre (BRC), University Hospital Birmingham and University of Birmingham, Birmingham, UK; University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK; Institute of Inflammation and Ageing, University of Birmingham, Birmingham, UK; Birmingham Health Partners Centre for Regulatory Science and Innovation, University of Birmingham, Birmingham, UK
- Peter van Wijngaarden: Centre for Eye Research Australia, Ophthalmology, University of Melbourne, Melbourne, VIC, Australia; Department of Surgery (Ophthalmology), University of Melbourne, Melbourne, VIC, Australia; Florey Institute of Neuroscience and Mental Health, University of Melbourne, Parkville, VIC, Australia
- Pearse A Keane: NIHR Biomedical Research Centre, Moorfields Eye Hospital NHS Foundation Trust, London, UK; Institute of Ophthalmology, University College London, London, UK
- Ching-Yu Cheng: Department of Ophthalmology and Centre for Innovation and Precision Eye Health, Yong Loo Lin School of Medicine, National University of Singapore, Singapore; Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; Ophthalmology and Visual Sciences Academic Clinical Program, Duke-NUS Medical School, Singapore
- Mingguang He: School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong, China; Research Centre for SHARP Vision (RCSV), The Hong Kong Polytechnic University, Kowloon, Hong Kong, China; Centre for Eye and Vision Research (CEVR), 17W Hong Kong Science Park, Hong Kong, China
- Tien Yin Wong: Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; School of Clinical Medicine, Beijing Tsinghua Changgung Hospital, Tsinghua Medicine, Tsinghua University, Beijing, China

5. D N S, Pai RM, Bhat SN, Pai M M M. Assessment of perceived realism in AI-generated synthetic spine fracture CT images. Technol Health Care 2025; 33:931-944. [PMID: 40105176] [DOI: 10.1177/09287329241291368]
Abstract
Background: Deep learning-based decision support systems can be trained with synthetic images generated by adversarial networks, but such images require clinical evaluation to ensure their quality. Objective: The study evaluates the perceived realism of high-dimension synthetic spine fracture CT images generated by Progressive Growing Generative Adversarial Networks (PGGANs). Method: The study used 2820 spine fracture CT images from 456 patients to train a PGGAN model. The model synthesized images up to 512 × 512 pixels, and the realism of the generated images was assessed using Visual Turing Tests (VTT) and a Fracture Identification Test (FIT). Three spine surgeons evaluated the images, and the clinical evaluation results were statistically analysed. Result: Spine surgeons showed an average prediction accuracy of nearly 50% during clinical evaluation, indicating difficulty in distinguishing between real and generated images. Accuracy varied across image dimensions, with the synthetic images being most realistic at 512 × 512 pixels. During FIT, 13-15 of the 16 generated images of each fracture type were correctly identified, indicating that the 512 × 512 images are realistic and clearly depict fracture lines. Conclusion: The study reveals that PGGANs can generate realistic synthetic spine fracture CT images up to 512 × 512 pixels that are difficult to distinguish from real images, supporting the development of automatic spine fracture type detection systems.
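
The statistical logic of a Visual Turing Test can be made concrete with a short sketch: if raters cannot tell real from generated images, their accuracy should be statistically indistinguishable from 50% chance. The counts below are hypothetical, not the study's data.

```python
# Illustrative VTT analysis (made-up numbers): test a rater's accuracy against chance.
from scipy.stats import binomtest

n_images, n_correct = 96, 48          # hypothetical rater: 48/96 correct calls
result = binomtest(n_correct, n_images, p=0.5)
print(f"accuracy={n_correct / n_images:.2f}, p={result.pvalue:.3f}")
# A large p-value means chance-level performance cannot be rejected, i.e. the
# synthetic images are indistinguishable from real ones for this rater.
```
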
Affiliation(s)
- Sindhura D N: Department of Data Science and Computer Applications, Manipal Institute of Technology, Manipal, Manipal Academy of Higher Education, Manipal, India
- Radhika M Pai: Department of Data Science and Computer Applications, Manipal Institute of Technology, Manipal, Manipal Academy of Higher Education, Manipal, India
- Shyamasunder N Bhat: Department of Orthopaedics, Kasturba Medical College, Manipal, Manipal Academy of Higher Education, Manipal, India
- Manohara Pai M M: Department of Information and Communication Technology, Manipal Institute of Technology, Manipal, Manipal Academy of Higher Education, Manipal, India

6. Fang L, Sheng H, Li H, Li S, Feng S, Chen M, Li Y, Chen J, Chen F. Unsupervised translation of vascular masks to NIR-II fluorescence images using attention-guided generative adversarial networks. Sci Rep 2025; 15:6725. [PMID: 40000690] [PMCID: PMC11861915] [DOI: 10.1038/s41598-025-91416-y]
Abstract
Fluorescence imaging in the second near-infrared window (NIR-II) is a crucial technology for investigating the structure and function of blood vessels. However, privacy concerns and the significant effort needed for data annotation complicate the acquisition of near-infrared vascular imaging datasets. To tackle these issues, deep learning-based data synthesis methods have demonstrated promise in generating high-quality synthetic images. In this paper, we propose an unsupervised generative adversarial network (GAN) approach for translating vascular masks into realistic NIR-II fluorescence vascular images. Leveraging an attention mechanism integrated into the loss function, our model focuses on essential features during the generation process, producing high-quality NIR-II images without the need for supervision. Our method significantly outperforms eight baseline techniques in both visual quality and quantitative metrics, demonstrating its potential to address the challenge of limited datasets in NIR-II medical imaging. This work not only enhances the applications of NIR-II imaging but also facilitates downstream tasks by providing abundant, high-fidelity synthetic data.
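
As a rough illustration of how an attention mechanism can be folded into a translation loss (one reading of the general idea; the paper's exact formulation may differ), the sketch below up-weights vessel pixels in a round-trip reconstruction loss, so that thin, dim vessels dominate training rather than the dark background.

```python
# Hedged sketch: attention-weighted reconstruction loss for unsupervised
# mask -> image -> mask translation (not the paper's exact loss).
import torch

def attention_weighted_l1(recon, target, attn):
    """L1 reconstruction loss with per-pixel attention weights."""
    return (attn * (recon - target).abs()).mean()

mask = (torch.rand(2, 1, 128, 128) > 0.9).float()   # input vascular mask
roundtrip = torch.rand(2, 1, 128, 128)              # cycle reconstruction of the mask
attn = 1.0 + 5.0 * mask                             # vessels weighted 6x over background
print(attention_weighted_l1(roundtrip, mask, attn))
```
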
Affiliation(s)
- Lu Fang: Chinese Academy of Sciences, Shanghai Institute of Technical Physics, Shanghai, 200083, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Huaixuan Sheng: Sports Medicine Institute of Fudan University, Department of Sports Medicine, Huashan Hospital, Fudan University, Shanghai, 200040, China
- Huizhu Li: Sports Medicine Institute of Fudan University, Department of Sports Medicine, Huashan Hospital, Fudan University, Shanghai, 200040, China
- Shunyao Li: Sports Medicine Institute of Fudan University, Department of Sports Medicine, Huashan Hospital, Fudan University, Shanghai, 200040, China
- Sijia Feng: Sports Medicine Institute of Fudan University, Department of Sports Medicine, Huashan Hospital, Fudan University, Shanghai, 200040, China
- Mo Chen: Department of Bone and Joint Surgery, Department of Orthopedics, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, 200001, China
- Yunxia Li: Sports Medicine Institute of Fudan University, Department of Sports Medicine, Huashan Hospital, Fudan University, Shanghai, 200040, China
- Jun Chen: Sports Medicine Institute of Fudan University, Department of Sports Medicine, Huashan Hospital, Fudan University, Shanghai, 200040, China
- Fuchun Chen: Chinese Academy of Sciences, Shanghai Institute of Technical Physics, Shanghai, 200083, China

7. Peng K, Huang D, Chen Y. Retinal OCT image classification based on MGR-GAN. Med Biol Eng Comput 2025. [PMID: 39862318] [DOI: 10.1007/s11517-025-03286-1]
Abstract
Accurately classifying optical coherence tomography (OCT) images is essential for diagnosing and treating ophthalmic diseases. This paper introduces MGR-GAN, a novel generative adversarial network (GAN) framework. The masked image modeling (MIM) method is integrated into the model's generator, enhancing its ability to synthesize realistic images by reconstructing them from the unmasked patches. A ResNet-structured discriminator determines whether an image was produced by the generator. Through the adversarial game between generator and discriminator, the discriminator acquires the high-level discriminative features essential for precise OCT classification. Experimental results demonstrate that MGR-GAN achieves a classification accuracy of 98.4% on the original UCSD dataset. Because the trained generator can synthesize OCT images with high fidelity, the generated images are also leveraged to address the category imbalance of the UCSD dataset; after balancing, the classification accuracy further improves to 99%.
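
The masked-image-modeling ingredient can be sketched in a few lines: random patches of the input are hidden, and the generator is trained to reconstruct the image from the visible remainder. The patch size and mask ratio below are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of random patch masking for masked image modeling (illustrative).
import torch

def random_patch_mask(img, patch=16, ratio=0.5):
    """Zero out a random subset of non-overlapping patches; return masked image and mask."""
    b, c, h, w = img.shape
    gh, gw = h // patch, w // patch
    keep = torch.rand(b, 1, gh, gw) > ratio                       # True = visible patch
    mask = keep.float().repeat_interleave(patch, 2).repeat_interleave(patch, 3)
    return img * mask, mask

oct_img = torch.rand(4, 1, 128, 128)
masked, mask = random_patch_mask(oct_img)
# A generator would then be trained to reconstruct `oct_img` from `masked`,
# typically with the loss restricted to the hidden (mask == 0) regions.
```
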
Affiliation(s)
- Kun Peng: School of Automation and Information Engineering, Sichuan University of Science & Engineering, Key Laboratory of Artificial Intelligence, Yibin, 644000, Sichuan, China
- Dan Huang: School of Automation and Information Engineering, Sichuan University of Science & Engineering, Key Laboratory of Artificial Intelligence, Yibin, 644000, Sichuan, China
- Yurong Chen: School of Automation and Information Engineering, Sichuan University of Science & Engineering, Key Laboratory of Artificial Intelligence, Yibin, 644000, Sichuan, China

8. Khan AR, Javed R, Sadad T, Bahaj SA, Sampedro GA, Abisado M. Early pigment spot segmentation and classification from iris cellular image analysis with explainable deep learning and multiclass support vector machine. Biochem Cell Biol 2025; 103:1-10. [PMID: 37906957] [DOI: 10.1139/bcb-2023-0183]
Abstract
Globally, retinal disorders affect thousands of individuals. Early diagnosis and treatment of these anomalies might halt their progression and prevent many cases of avoidable blindness. Iris spot segmentation is challenging because acquired iris cellular images suffer from off-angle capture, noise, and specular reflection. Most currently used iris segmentation techniques are based on edge data and non-cellular images. The size of the pigment patches on the surface of the iris increases with eye syndromes, and iris images taken in uncooperative settings are frequently noisy, making precise segmentation difficult. Traditional diagnostic processes are costly and time consuming, since they require highly qualified personnel and controlled environments. This paper presents an explainable deep learning model integrated with a multiclass support vector machine to analyze iris cellular images for early pigment spot segmentation and classification. Three benchmark datasets, MILE, UPOL, and Eyes SUB, were used in the experiments to test the proposed methodology. The experimental results, compared on standard metrics, demonstrate that the proposed model outperforms the methods reported in the literature with respect to classification error. Additionally, the proposed parameters are highly effective in locating micro pigment spots on the iris surface.
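
The hybrid design, deep features fed to a multiclass SVM instead of a softmax head, can be sketched as follows; the feature matrix and class labels are hypothetical stand-ins for embeddings that a trained network would produce.

```python
# Minimal sketch of a deep-features-plus-SVM pipeline (shapes and labels are made up).
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
features = rng.normal(size=(300, 512))   # stand-in for CNN embeddings of iris images
labels = rng.integers(0, 3, size=300)    # hypothetical classes, e.g. spot severity levels

X_tr, X_te, y_tr, y_te = train_test_split(features, labels, random_state=0)
clf = SVC(kernel="rbf", C=1.0)           # RBF-kernel SVM; sklearn handles multiclass
clf.fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```
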
Affiliation(s)
- Amjad R Khan: Department of Information Systems, Prince Sultan University, Riyadh 66833, Saudi Arabia
- Rabia Javed: Department of Computer Science, Lahore College for Women University, Lahore, Pakistan
- Tariq Sadad: Department of Computer Science, University of Engineering and Technology, Mardan, Pakistan
- Saeed Ali Bahaj: MIS Department, College of Business Administration, Prince Sattam bin Abdulaziz University, Alkharj 11942, Saudi Arabia
- Gabriel Avelino Sampedro: Faculty of Information and Communication Studies, University of the Philippines Open University, Philippines and Center for Computational Imaging and Visual Innovations, De La Salle University, Los Baños 4031, 2401 Taft Ave., Malate, Manila 1004, Philippines
- Mideth Abisado: College of Computing and Information Technologies, National University, Manila, Philippines

9. Dos Reis Carvalho A, da Silva MV, Comin CH. Artificial vascular image generation using blood vessel texture maps. Comput Biol Med 2024; 183:109226. [PMID: 39378578] [DOI: 10.1016/j.compbiomed.2024.109226]
Abstract
BACKGROUND: Current methods for identifying blood vessels in digital images typically involve training neural networks on pixel-wise annotated data. However, manually outlining whole vessel trees in images tends to be very costly. One approach for reducing the amount of manual annotation is to pre-train networks on artificially generated vessel images. Recent pre-training approaches focus on generating proper artificial geometries for the vessels, while the appearance of the vessels is defined using general statistics of the real samples or generative networks requiring an additional training procedure to be defined. In contrast, we propose a methodology for generating blood vessels with realistic textures extracted directly from manually annotated vessel segments from real samples. The method allows the generation of artificial images having blood vessels with similar geometry and texture to the real samples using only a handful of manually annotated vessels. METHODS: The first step of the method is the manual annotation of the borders of a small vessel segment, which takes only a few seconds. The annotation is then used for creating a reference image containing the texture of the vessel, called a texture map. A procedure is then defined to allow texture maps to be placed on top of any smooth curve using a piecewise linear transformation. Artificial images are then created by generating a set of vessel geometries using Bézier curves and assigning vessel texture maps to the curves. RESULTS: The method is validated on a fluorescence microscopy (CORTEX) and a fundus photography (DRIVE) dataset. We show that manually annotating only 0.03% of the vessels in the CORTEX dataset allows pre-training a network to reach, on average, a Dice score of 0.87 ± 0.02, which is close to the baseline score of 0.92 obtained when all vessels of the training split of the dataset are annotated. For the DRIVE dataset, on average, a Dice score of 0.74 ± 0.02 is obtained by annotating only 0.29% of the vessels, which is also close to the baseline Dice score of 0.81 obtained when all vessels are annotated. CONCLUSION: The proposed method can be used for disentangling the geometry and texture of blood vessels, which allows a significant improvement of network pre-training performance when compared to other pre-training methods commonly used in the literature.
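
The geometry half of the pipeline can be illustrated with a short sketch: sampling points along a cubic Bézier curve yields a smooth artificial centerline onto which a texture map could then be warped. The control points here are random placeholders.

```python
# Minimal sketch: a cubic Bezier centerline for an artificial vessel (illustrative).
import numpy as np

def cubic_bezier(p0, p1, p2, p3, n=200):
    """Return n points along a cubic Bezier curve defined by 4 control points."""
    t = np.linspace(0.0, 1.0, n)[:, None]
    return ((1 - t) ** 3 * p0 + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2 + t ** 3 * p3)

rng = np.random.default_rng(42)
pts = cubic_bezier(*rng.uniform(0, 512, size=(4, 2)))  # centerline in a 512x512 image
print(pts.shape)  # (200, 2) -> x, y coordinates of the artificial vessel centerline
```
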
Affiliation(s)
- Matheus Viana da Silva: Department of Computer Science, Federal University of São Carlos, São Carlos, SP, Brazil
- Cesar H Comin: Department of Computer Science, Federal University of São Carlos, São Carlos, SP, Brazil

10. Lim G, Elangovan K, Jin L. Vision language models in ophthalmology. Curr Opin Ophthalmol 2024; 35:487-493. [PMID: 39259649] [DOI: 10.1097/icu.0000000000001089]
Abstract
PURPOSE OF REVIEW: Vision Language Models are an emerging paradigm in artificial intelligence that can natively analyze both image and textual data within a single model. The fusion of these two modalities is of particular relevance to ophthalmology, which has historically relied on specialized imaging techniques such as angiography, optical coherence tomography, and fundus photography, while also interfacing with electronic health records that include free-text descriptions. This review surveys the fast-evolving field of Vision Language Models as they apply to current ophthalmologic research and practice. RECENT FINDINGS: Although models incorporating both image and text data have a long provenance in ophthalmology, effective multimodal Vision Language Models are a recent development exploiting advances in technologies such as transformer and autoencoder models. SUMMARY: Vision Language Models offer the potential to assist and streamline the existing clinical workflow in ophthalmology before, during, and after the patient visit. There are, however, important challenges to be overcome, particularly regarding patient privacy and the explainability of model recommendations.

11. Huang K, Ma X, Zhang Z, Zhang Y, Yuan S, Fu H, Chen Q. Diverse data generation for retinal layer segmentation with potential structure modeling. IEEE Trans Med Imaging 2024; 43:3584-3595. [PMID: 38587957] [DOI: 10.1109/tmi.2024.3384484]
Abstract
Accurate retinal layer segmentation on optical coherence tomography (OCT) images is hampered by the difficulty of collecting OCT images with diverse pathological characterization and balanced distribution. Current generative models can produce highly realistic images and corresponding labels without quantitative limitations by fitting the distributions of real collected data. Nevertheless, the diversity of their generated data is still limited by the inherent imbalance of the training data. To address these issues, we propose an image-label pair generation framework that generates diverse and balanced potential data from imbalanced real samples. Specifically, the framework first generates diverse layer masks and then generates plausible OCT images corresponding to these masks, using two customized diffusion probabilistic models. To learn from imbalanced data and facilitate balanced generation, we introduce pathology-related conditions to guide the generation processes. To enhance the diversity of the generated image-label pairs, we propose a potential structure modeling technique that transfers the knowledge of diverse sub-structures from mildly pathological or non-pathological samples to highly pathological samples. We conducted extensive experiments on two public datasets for retinal layer segmentation. Our method generates OCT images with higher image quality and diversity than other generative methods, and downstream retinal layer segmentation tasks trained extensively with the generated OCT images demonstrate improved results. The code is publicly available at: https://github.com/nicetomeetu21/GenPSM.

12. Zhou W, Wang X, Yang X, Hu Y, Yi Y. Skeleton-guided multi-scale dual-coordinate attention aggregation network for retinal blood vessel segmentation. Comput Biol Med 2024; 181:109027. [PMID: 39178808] [DOI: 10.1016/j.compbiomed.2024.109027]
Abstract
Deep learning plays a pivotal role in retinal blood vessel segmentation for medical diagnosis. Despite their significant efficacy, these techniques face two major challenges. First, they often neglect the severe class imbalance in fundus images, where thin foreground vessels occupy only a small fraction of pixels. Second, they are susceptible to poor image quality and blurred vessel edges, resulting in discontinuities or breaks in vascular structures. In response, this paper proposes the Skeleton-guided Multi-scale Dual-coordinate Attention Aggregation (SMDAA) network for retinal vessel segmentation. SMDAA comprises three novel modules: Dual-coordinate Attention (DCA), Unbalanced Pixel Amplifier (UPA), and Vessel Skeleton Guidance (VSG). DCA, integrating Multi-scale Coordinate Feature Aggregation (MCFA) and Scale Coordinate Attention Decoding (SCAD), analyzes vessel structures across scales and captures intricate details, significantly enhancing segmentation accuracy. To address class imbalance, the UPA module dynamically allocates more attention to misclassified pixels, ensuring precise extraction of thin and small blood vessels. Moreover, to preserve the continuity of vessel structures, the VSG module incorporates vessel anatomy to connect fragmented vessel segments. Additionally, a Feature-level Contrast (FCL) loss is introduced to capture subtle nuances within the same category, enhancing the fidelity of retinal blood vessel segmentation. Extensive experiments on three public datasets (DRIVE, STARE, and CHASE_DB1) demonstrate superior performance compared to current methods. The code is available at https://github.com/wangwxr/SMDAA_NET.
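
The effect attributed to the UPA module, giving misclassified pixels more influence, can be approximated with a focal-style weighting of the pixel-wise loss. This is an illustrative reading of the idea, not the paper's exact formulation.

```python
# Hedged sketch: amplify the loss on hard (misclassified) pixels, focal-loss style.
import torch

def amplified_bce(logits, target, gamma=2.0):
    """BCE where hard pixels receive larger weights, countering class imbalance."""
    p = torch.sigmoid(logits)
    pt = torch.where(target > 0.5, p, 1 - p)        # probability assigned to the true class
    weight = (1 - pt) ** gamma                      # ~0 for easy pixels, ~1 for hard ones
    bce = torch.nn.functional.binary_cross_entropy_with_logits(
        logits, target, reduction="none")
    return (weight * bce).mean()

logits = torch.randn(2, 1, 64, 64)
target = (torch.rand(2, 1, 64, 64) > 0.95).float()  # sparse vessel foreground
print(amplified_bce(logits, target))
```
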
Affiliation(s)
- Wei Zhou: College of Computer Science, Shenyang Aerospace University, Shenyang, China
- Xiaorui Wang: College of Computer Science, Shenyang Aerospace University, Shenyang, China
- Xuekun Yang: College of Computer Science, Shenyang Aerospace University, Shenyang, China
- Yangtao Hu: Department of Ophthalmology, The 908th Hospital of Chinese People's Liberation Army Joint Logistic Support Force, Nanchang, China
- Yugen Yi: School of Software, Jiangxi Normal University, Nanchang, China

13. He Z, Lu N, Chen Y, Chun-Sing Chui E, Liu Z, Qin X, Li J, Wang S, Yang J, Wang Z, Wang Y, Qiu Y, Yuk-Wai Lee W, Chun-Yiu Cheng J, Yang KG, Yiu-Chung Lau A, Liu X, Chen X, Li WJ, Zhu Z. Conditional generative adversarial network-assisted system for radiation-free evaluation of scoliosis using a single smartphone photograph: a model development and validation study. EClinicalMedicine 2024; 75:102779. [PMID: 39252864] [PMCID: PMC11381623] [DOI: 10.1016/j.eclinm.2024.102779]
Abstract
Background: Adolescent idiopathic scoliosis (AIS) is the most common spinal disorder in children, characterized by insidious onset and rapid progression, which can lead to severe consequences if not detected in a timely manner. Currently, the diagnosis of AIS primarily relies on X-ray imaging. However, due to limitations in healthcare access and concerns over radiation exposure, this diagnostic method cannot be widely adopted. We therefore developed and validated a screening system using deep learning that generates virtual X-ray images (VXI) from two-dimensional Red Green Blue (2D-RGB) images captured by a smartphone or camera, to assist spine surgeons in the rapid, accurate, and non-invasive assessment of AIS. Methods: We included 2397 patients with AIS and 48 potential patients with AIS who visited four medical institutions in mainland China from June 11th 2014 to November 28th 2023. Participant data included standing full-spine X-ray images captured by radiology technicians and 2D-RGB images taken by spine surgeons using a camera. We developed a deep learning model based on conditional generative adversarial networks (cGAN) called Swin-pix2pix to generate VXI on retrospective training (n = 1842) and validation (n = 100) datasets, then validated the performance of VXI in quantifying the curve type and severity of AIS on retrospective internal (n = 100), external (n = 135), and prospective test datasets (n = 268). The prospective test dataset comprised 220 patients with AIS and 48 potential patients with AIS treated in Nanjing, China, from April 19th 2023 to November 28th 2023; their data underwent strict quality control to ensure optimal data quality and consistency. Findings: Our Swin-pix2pix model generated realistic VXI, with mean absolute errors (MAE) of 3.2° and 3.1° for predicting the main and secondary Cobb angles of AIS on the prospective test dataset, significantly lower than the other baseline cGAN models. The diagnostic accuracy for scoliosis severity grading exceeded that of two spine surgery experts, with accuracy of 0.93 (95% CI [0.91, 0.95]) for the main curve and 0.89 (95% CI [0.87, 0.91]) for the secondary curve. For main curve position and curve classification, the predictive accuracy of the Swin-pix2pix model also surpassed that of the baseline cGAN models, with accuracy of 0.93 (95% CI [0.90, 0.95]) for the thoracic curve and 0.97 (95% CI [0.96, 0.98]), achieving satisfactory results on three external datasets as well. Interpretation: Our Swin-pix2pix model holds promise for using a single photo taken with a smartphone or camera to rapidly assess AIS curve type and severity without radiation, enabling large-scale screening. However, limited data quality and quantity, a homogeneous participant population, and rotational errors during imaging may affect the applicability and accuracy of the system, requiring further improvement in the future. Funding: National Key R&D Program of China, Natural Science Foundation of Jiangsu Province, China Postdoctoral Science Foundation, Nanjing Medical Science and Technology Development Foundation, Jiangsu Provincial Key Research and Development Program, and Jiangsu Provincial Medical Innovation Centre of Orthopedic Surgery.
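
The cGAN objective underlying a pix2pix-style model can be sketched as below: the generator maps a photograph to a virtual X-ray, and a conditional discriminator judges (photo, X-ray) pairs, with an L1 term tying the output to the paired real radiograph. The networks, shapes, and weight 100.0 are illustrative placeholders, not the Swin-pix2pix implementation.

```python
# Minimal sketch of a pix2pix-style conditional GAN generator loss (illustrative).
import torch
import torch.nn as nn

G = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(16, 1, 3, padding=1))            # photo -> virtual X-ray
D = nn.Sequential(nn.Conv2d(4, 16, 3, stride=2, padding=1), nn.ReLU(),
                  nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1))

bce = nn.BCEWithLogitsLoss()
photo = torch.rand(2, 3, 128, 128)                           # 2D-RGB back image
xray = torch.rand(2, 1, 128, 128)                            # paired real radiograph

fake = G(photo)
# The conditional discriminator sees (input photo, candidate X-ray) pairs.
d_fake = D(torch.cat([photo, fake], dim=1))
loss_g = bce(d_fake, torch.ones_like(d_fake)) + 100.0 * nn.functional.l1_loss(fake, xray)
loss_g.backward()
```
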
Affiliation(s)
- Zhong He: Division of Spine Surgery, Department of Orthopedic Surgery, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, China
- Neng Lu: National Key Laboratory for Novel Software Technology, Department of Computer Science and Technology, Nanjing University, Nanjing, China
- Yi Chen: Division of Spine Surgery, Department of Orthopedic Surgery, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, China
- Elvis Chun-Sing Chui: Department of Orthopaedics and Traumatology, The Chinese University of Hong Kong, Hong Kong, China
- Zhen Liu: Division of Spine Surgery, Department of Orthopedic Surgery, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, China
- Xiaodong Qin: Division of Spine Surgery, Department of Orthopedic Surgery, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, China
- Jie Li: Division of Spine Surgery, Department of Orthopedic Surgery, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, China
- Shengru Wang: Department of Orthopedics, Peking Union Medical College Hospital, Beijing, China
- Junlin Yang: Spine Center, Xinhua Hospital Affiliated to Shanghai Jiaotong University School of Medicine, Shanghai, China
- Zhiwei Wang: Department of Orthopaedic Surgery, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Yimu Wang: David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Canada
- Yong Qiu: Division of Spine Surgery, Department of Orthopedic Surgery, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, China
- Wayne Yuk-Wai Lee: Department of Orthopaedics and Traumatology, The Chinese University of Hong Kong, Hong Kong, China
- Jack Chun-Yiu Cheng: Department of Orthopaedics and Traumatology, The Chinese University of Hong Kong, Hong Kong, China
- Kenneth Guangpu Yang: Department of Orthopaedics and Traumatology, The Chinese University of Hong Kong, Hong Kong, China
- Adam Yiu-Chung Lau: Department of Orthopaedics and Traumatology, The Chinese University of Hong Kong, Hong Kong, China
- Xiaoli Liu: Department of Orthopaedics and Traumatology, The Chinese University of Hong Kong, Hong Kong, China
- Xipu Chen: Division of Spine Surgery, Department of Orthopedic Surgery, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, China
- Wu-Jun Li: National Key Laboratory for Novel Software Technology, Department of Computer Science and Technology, Nanjing University, Nanjing, China; Center of Medical Big Data, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, China; National Institute of Healthcare Data Science at Nanjing University, Nanjing, China
- Zezhang Zhu: Division of Spine Surgery, Department of Orthopedic Surgery, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, China

14. Kreitner L, Paetzold JC, Rauch N, Chen C, Hagag AM, Fayed AE, Sivaprasad S, Rausch S, Weichsel J, Menze BH, Harders M, Knier B, Rueckert D, Menten MJ. Synthetic optical coherence tomography angiographs for detailed retinal vessel segmentation without human annotations. IEEE Trans Med Imaging 2024; 43:2061-2073. [PMID: 38224512] [DOI: 10.1109/tmi.2024.3354408]
Abstract
Optical coherence tomography angiography (OCTA) is a non-invasive imaging modality that can acquire high-resolution volumes of the retinal vasculature and aid the diagnosis of ocular, neurological and cardiac diseases. Segmenting the visible blood vessels is a common first step when extracting quantitative biomarkers from these images. Classical segmentation algorithms based on thresholding are strongly affected by image artifacts and limited signal-to-noise ratio. The use of modern, deep learning-based segmentation methods has been inhibited by a lack of large datasets with detailed annotations of the blood vessels. To address this issue, recent work has employed transfer learning, where a segmentation network is trained on synthetic OCTA images and is then applied to real data. However, the previously proposed simulations fail to faithfully model the retinal vasculature and do not provide effective domain adaptation. Because of this, current methods are unable to fully segment the retinal vasculature, in particular the smallest capillaries. In this work, we present a lightweight simulation of the retinal vascular network based on space colonization for faster and more realistic OCTA synthesis. We then introduce three contrast adaptation pipelines to decrease the domain gap between real and artificial images. We demonstrate the superior segmentation performance of our approach in extensive quantitative and qualitative experiments on three public datasets that compare our method to traditional computer vision algorithms and supervised training using human annotations. Finally, we make our entire pipeline publicly available, including the source code, pretrained models, and a large dataset of synthetic OCTA images.
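
The space-colonization growth rule the simulation is based on can be condensed into a few lines: scattered attraction points pull the nearest node of a growing tree toward them, and attractors that a node reaches are removed, producing an organic branching pattern. All parameters below are illustrative, not the paper's settings.

```python
# Compact sketch of space colonization for vascular tree growth (illustrative).
import numpy as np

rng = np.random.default_rng(0)
attractors = rng.uniform(0, 1, size=(300, 2))   # targets the vasculature should cover
nodes = [np.array([0.5, 0.0])]                  # growing tree, seeded at the root
step, kill_radius = 0.03, 0.05

for _ in range(200):
    if len(attractors) == 0:
        break
    pts = np.array(nodes)
    nearest = np.argmin(np.linalg.norm(attractors[:, None] - pts[None], axis=2), axis=1)
    for i in set(nearest.tolist()):             # each influenced node grows one step
        pull = attractors[nearest == i] - pts[i]
        d = pull / np.linalg.norm(pull, axis=1, keepdims=True)
        nodes.append(pts[i] + step * d.mean(axis=0))
    pts = np.array(nodes)
    dist = np.min(np.linalg.norm(attractors[:, None] - pts[None], axis=2), axis=1)
    attractors = attractors[dist > kill_radius] # reached attractors are removed
print(len(nodes), "tree nodes grown")
```
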

15. Ktena I, Wiles O, Albuquerque I, Rebuffi SA, Tanno R, Roy AG, Azizi S, Belgrave D, Kohli P, Cemgil T, Karthikesalingam A, Gowal S. Generative models improve fairness of medical classifiers under distribution shifts. Nat Med 2024; 30:1166-1173. [PMID: 38600282] [PMCID: PMC11031395] [DOI: 10.1038/s41591-024-02838-6]
Abstract
Domain generalization is a ubiquitous challenge for machine learning in healthcare. Model performance in real-world conditions might be lower than expected because of discrepancies between the data encountered during deployment and development. Underrepresentation of some groups or conditions during model development is a common cause of this phenomenon. This challenge is often not readily addressed by targeted data acquisition and 'labeling' by expert clinicians, which can be prohibitively expensive or practically impossible because of the rarity of conditions or the available clinical expertise. We hypothesize that advances in generative artificial intelligence can help mitigate this unmet need in a steerable fashion, enriching our training dataset with synthetic examples that address shortfalls of underrepresented conditions or subgroups. We show that diffusion models can automatically learn realistic augmentations from data in a label-efficient manner. We demonstrate that learned augmentations make models more robust and statistically fair in-distribution and out of distribution. To evaluate the generality of our approach, we studied three distinct medical imaging contexts of varying difficulty: (1) histopathology, (2) chest X-ray and (3) dermatology images. Complementing real samples with synthetic ones improved the robustness of models in all three medical tasks and increased fairness by improving the accuracy of clinical diagnosis within underrepresented groups, especially out of distribution.
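
In practice, the augmentation recipe amounts to topping up underrepresented (label, group) cells of the training set with synthetic samples. A toy sketch with hypothetical counts:

```python
# Illustrative sketch (made-up counts): compute how many synthetic samples each
# (label, group) cell needs so that all cells match the largest one.
from collections import Counter

train = [("melanoma", "dark_skin")] * 40 + [("melanoma", "light_skin")] * 400 \
      + [("benign", "dark_skin")] * 300 + [("benign", "light_skin")] * 500

counts = Counter(train)
target = max(counts.values())
synthetic_budget = {cell: target - n for cell, n in counts.items()}
print(synthetic_budget)
# e.g. {('melanoma', 'dark_skin'): 460, ...} -> number of diffusion-generated
# images to add per cell before retraining the classifier.
```
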

16. Wang W, Xia Q, Yan Z, Hu Z, Chen Y, Zheng W, Wang X, Nie S, Metaxas D, Zhang S. AVDNet: Joint coronary artery and vein segmentation with topological consistency. Med Image Anal 2024; 91:102999. [PMID: 37862866] [DOI: 10.1016/j.media.2023.102999]
Abstract
Coronary CT angiography (CCTA) is an effective and non-invasive method for coronary artery disease diagnosis. Extracting an accurate coronary artery tree from a CCTA image is essential for centerline extraction, plaque detection, and stenosis quantification. In practice, data quality varies; sometimes the arteries and veins have similar intensities and lie close together, which may confuse segmentation algorithms, even deep learning based ones. However, it is not always feasible to re-scan the patient for better image quality. In this paper, we propose an artery and vein disentanglement network (AVDNet) for robust and accurate segmentation by incorporating the coronary vein into the segmentation task. This is the first work to segment the coronary artery and vein at the same time. The AVDNet consists of an image based vessel recognition network (IVRN) and a topology based vessel refinement network (TVRN). IVRN learns to segment the arteries and veins, while TVRN learns to correct the segmentation errors based on topology consistency. We also design a novel inverse distance weighted dice (IDD) loss function to recover more thin vessel branches and preserve the vascular boundaries. Extensive experiments are conducted on a multi-center dataset of 700 patients. Quantitative and qualitative results demonstrate the effectiveness of the proposed method in comparison with state-of-the-art methods and different variants. Prediction results of the AVDNet on the Automated Segmentation of Coronary Artery Challenge dataset are available at https://github.com/WennyJJ/Coronary-Artery-Vein-Segmentation for follow-up research.
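
A plausible form of an inverse-distance-weighted Dice loss is sketched below; this is one reading of the IDD idea (weights growing near the vascular boundary, so thin branches and edges count more in the overlap), not necessarily the paper's exact definition.

```python
# Hedged sketch of an inverse-distance-weighted Dice loss (illustrative formulation).
import numpy as np
import torch
from scipy.ndimage import distance_transform_edt

def idd_weights(mask_np, eps=1.0):
    """Weights that are large near the vascular boundary and in thin branches."""
    inside = distance_transform_edt(mask_np)         # distance to background
    outside = distance_transform_edt(1 - mask_np)    # distance to foreground
    return 1.0 / (inside + outside + eps)            # inverse distance to the boundary

def weighted_dice_loss(prob, target, w):
    inter = (w * prob * target).sum()
    return 1 - 2 * inter / ((w * prob).sum() + (w * target).sum() + 1e-6)

mask = (np.random.rand(64, 64) > 0.9).astype(np.float32)  # stand-in vessel mask
w = torch.from_numpy(idd_weights(mask)).float()
prob = torch.rand(64, 64)                                  # stand-in network output
print(weighted_dice_loss(prob, torch.from_numpy(mask), w))
```
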
Affiliation(s)
- Wenji Wang: SenseTime Research, Beijing, 100080, China
- Qing Xia: SenseTime Research, Beijing, 100080, China
- Yinan Chen: SenseTime Research, Beijing, 100080, China
- Wen Zheng: Center for Coronary Artery Disease, Beijing Anzhen Hospital, Capital Medical University, Beijing, 100029, China
- Xiao Wang: Center for Coronary Artery Disease, Beijing Anzhen Hospital, Capital Medical University, Beijing, 100029, China
- Shaoping Nie: Center for Coronary Artery Disease, Beijing Anzhen Hospital, Capital Medical University, Beijing, 100029, China
- Dimitris Metaxas: Department of Computer Science, Rutgers University, NJ, 08854, USA
- Shaoting Zhang: SenseTime Research, Beijing, 100080, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200032, China

17. Chen R, Liu M, Chen W, Wang Y, Meijering E. Deep learning in mesoscale brain image analysis: A review. Comput Biol Med 2023; 167:107617. [PMID: 37918261] [DOI: 10.1016/j.compbiomed.2023.107617]
Abstract
Mesoscale microscopy images of the brain contain a wealth of information which can help us understand the working mechanisms of the brain. However, it is a challenging task to process and analyze these data because of the large size of the images, their high noise levels, the complex morphology of the brain from the cellular to the regional and anatomical levels, the inhomogeneous distribution of fluorescent labels in the cells and tissues, and imaging artifacts. Due to their impressive ability to extract relevant information from images, deep learning algorithms are widely applied to microscopy images of the brain to address these challenges and they perform superiorly in a wide range of microscopy image processing and analysis tasks. This article reviews the applications of deep learning algorithms in brain mesoscale microscopy image processing and analysis, including image synthesis, image segmentation, object detection, and neuron reconstruction and analysis. We also discuss the difficulties of each task and possible directions for further research.
Collapse
Affiliation(s)
- Runze Chen
- College of Electrical and Information Engineering, National Engineering Laboratory for Robot Visual Perception and Control Technology, Hunan University, Changsha, 410082, China
| | - Min Liu
- College of Electrical and Information Engineering, National Engineering Laboratory for Robot Visual Perception and Control Technology, Hunan University, Changsha, 410082, China; Research Institute of Hunan University in Chongqing, Chongqing, 401135, China.
| | - Weixun Chen
- College of Electrical and Information Engineering, National Engineering Laboratory for Robot Visual Perception and Control Technology, Hunan University, Changsha, 410082, China
| | - Yaonan Wang
- College of Electrical and Information Engineering, National Engineering Laboratory for Robot Visual Perception and Control Technology, Hunan University, Changsha, 410082, China
| | - Erik Meijering
- School of Computer Science and Engineering, University of New South Wales, Sydney 2052, New South Wales, Australia
| |
Collapse
|
18
|
Lin L, Peng L, He H, Cheng P, Wu J, Wong KKY, Tang X. YoloCurvSeg: You only label one noisy skeleton for vessel-style curvilinear structure segmentation. Med Image Anal 2023; 90:102937. [PMID: 37672901 DOI: 10.1016/j.media.2023.102937] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2023] [Revised: 06/30/2023] [Accepted: 08/16/2023] [Indexed: 09/08/2023]
Abstract
Weakly-supervised learning (WSL) has been proposed to alleviate the conflict between data annotation cost and model performance by employing sparsely-grained (i.e., point-, box-, or scribble-wise) supervision, and has shown promising performance, particularly in the image segmentation field. However, it is still a very challenging task given the limited supervision, especially when only a small number of labeled samples are available. Additionally, almost all existing WSL segmentation methods are designed for star-convex structures, which are very different from curvilinear structures such as vessels and nerves. In this paper, we propose a novel sparsely annotated segmentation framework for curvilinear structures, named YoloCurvSeg. An essential component of YoloCurvSeg is image synthesis. Specifically, a background generator delivers image backgrounds that closely match the real distributions through inpainting of dilated skeletons. The extracted backgrounds are then combined with randomly emulated curves generated by a Space Colonization Algorithm-based foreground generator through a multilayer patch-wise contrastive learning synthesizer. In this way, a synthetic dataset with both images and curve segmentation labels is obtained, at the cost of only one or a few noisy skeleton annotations. Finally, a segmenter is trained with the generated dataset and possibly an unlabeled dataset. The proposed YoloCurvSeg is evaluated on four publicly available datasets (OCTA500, CORN, DRIVE and CHASEDB1) and the results show that YoloCurvSeg outperforms state-of-the-art WSL segmentation methods by large margins. With only one noisy skeleton annotation (respectively 0.14%, 0.03%, 1.40%, and 0.65% of the full annotation), YoloCurvSeg achieves more than 97% of the fully-supervised performance on each dataset. Code and datasets will be released at https://github.com/llmir/YoloCurvSeg.
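As an illustration of how cheap curvilinear labels can be synthesized, the toy generator below rasterizes smooth random curves into binary masks. It is a deliberately simplified stand-in for the paper's Space Colonization Algorithm foreground generator; the function name and parameters are invented for this sketch.

```python
# Toy foreground generator: smooth random-walk curves rasterized as binary
# vessel-style masks (a simplified stand-in, not the paper's algorithm).
import numpy as np

def random_curve_mask(size=256, n_curves=4, steps=400, rng=None):
    rng = np.random.default_rng(rng)
    mask = np.zeros((size, size), dtype=np.uint8)
    for _ in range(n_curves):
        x, y = rng.uniform(0, size, 2)          # random start point
        theta = rng.uniform(0, 2 * np.pi)       # random initial heading
        for _ in range(steps):
            theta += rng.normal(0.0, 0.15)      # small heading changes keep curves smooth
            x, y = x + np.cos(theta), y + np.sin(theta)
            if 0 <= int(x) < size and 0 <= int(y) < size:
                mask[int(y), int(x)] = 1
    return mask
```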
Collapse
Affiliation(s)
- Li Lin
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China; Department of Electrical and Electronic Engineering, the University of Hong Kong, Hong Kong, China; Jiaxing Research Institute, Southern University of Science and Technology, Jiaxing, China
| | - Linkai Peng
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China
| | - Huaqing He
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China; Jiaxing Research Institute, Southern University of Science and Technology, Jiaxing, China
| | - Pujin Cheng
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China; Jiaxing Research Institute, Southern University of Science and Technology, Jiaxing, China
| | - Jiewei Wu
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China
| | - Kenneth K Y Wong
- Department of Electrical and Electronic Engineering, the University of Hong Kong, Hong Kong, China
| | - Xiaoying Tang
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China; Jiaxing Research Institute, Southern University of Science and Technology, Jiaxing, China.
| |
Collapse
|
19
|
Hou N, Shi J, Ding X, Nie C, Wang C, Wan J. ROP-GAN: an image synthesis method for retinopathy of prematurity based on generative adversarial network. Phys Med Biol 2023; 68:205016. [PMID: 37619572 DOI: 10.1088/1361-6560/acf3c9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/24/2023] [Accepted: 08/24/2023] [Indexed: 08/26/2023]
Abstract
Objective. Training data with annotations are scarce in the intelligent diagnosis of retinopathy of prematurity (ROP), and existing typical data augmentation methods cannot generate data with a high degree of diversity. In order to increase the sample size and the generalization ability of the classification model, we propose a method called ROP-GAN for image synthesis of ROP based on a generative adversarial network. Approach. To generate a binary vascular network from color fundus images, we first design an image segmentation model based on U2-Net that can extract multi-scale features without reducing the resolution of the feature map. The vascular network is then fed into an adversarial autoencoder for reconstruction, which increases the diversity of the vascular network diagram. Then, we design an ROP image synthesis algorithm based on a generative adversarial network, in which paired color fundus images and binarized vascular networks are input into the image generation model to train the generator and discriminator, and attention mechanism modules are added to the generator to improve its detail synthesis ability. Main results. Qualitative and quantitative evaluation indicators are applied to evaluate the proposed method, and experiments demonstrate that the proposed method is superior to the existing ROP image synthesis methods, as it can synthesize realistic ROP fundus images. Significance. Our method effectively alleviates the problem of data imbalance in ROP intelligent diagnosis, contributes to the implementation of ROP staging tasks, and lays the foundation for further research. In addition to classification tasks, our synthesized images can facilitate tasks that require large amounts of medical data, such as detecting lesions and segmenting medical images.
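The abstract describes paired training of a generator and discriminator on (vascular network, fundus image) pairs. The fragment below is a generic pix2pix-style training step for such paired mask-to-image synthesis, not the actual ROP-GAN, which additionally uses attention modules in the generator and an adversarial autoencoder for mask diversity.

```python
# Minimal pix2pix-style training step for mask-to-fundus synthesis (sketch).
# Assumes photos are normalized to [-1, 1] to match the generator's Tanh.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(32, 3, 3, padding=1), nn.Tanh())        # mask -> RGB
D = nn.Sequential(nn.Conv2d(4, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
                  nn.Conv2d(32, 1, 4, stride=2, padding=1))         # patch critic
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()

def train_step(mask, photo):                # mask: (B,1,H,W), photo: (B,3,H,W)
    fake = G(mask)
    # The discriminator judges (condition, image) pairs, as in pix2pix.
    d_real = D(torch.cat([mask, photo], 1))
    d_fake = D(torch.cat([mask, fake.detach()], 1))
    loss_d = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    d_fake = D(torch.cat([mask, fake], 1))
    loss_g = bce(d_fake, torch.ones_like(d_fake)) + 100.0 * l1(fake, photo)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```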
Collapse
Affiliation(s)
- Ning Hou
- School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou 510641, People's Republic of China
| | - Jianhua Shi
- School of Mechanical and Electrical Engineering, Shanxi Datong University, Shanxi 037009, People's Republic of China
| | - Xiaoxuan Ding
- School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou 510641, People's Republic of China
| | - Chuan Nie
- Department of Neonatology, Guangdong Women and Children Hospital, Guangzhou 511442, People's Republic of China
| | - Cuicui Wang
- Graduate School, Guangzhou Medical University, Guangzhou 511495, People's Republic of China
| | - Jiafu Wan
- School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou 510641, People's Republic of China
| |
Collapse
|
20
|
Xie Y, Wan Q, Xie H, Xu Y, Wang T, Wang S, Lei B. Fundus Image-Label Pairs Synthesis and Retinopathy Screening via GANs With Class-Imbalanced Semi-Supervised Learning. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:2714-2725. [PMID: 37030825 DOI: 10.1109/tmi.2023.3263216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
Retinopathy is the primary cause of irreversible yet preventable blindness. Numerous deep-learning algorithms have been developed for automatic retinal fundus image analysis. However, existing methods are usually data-driven and rarely consider the costs associated with fundus image collection and annotation, or the class-imbalanced distribution that arises from the relative scarcity of disease-positive individuals in the population. Semi-supervised learning on class-imbalanced data, despite being a realistic problem, has been relatively little studied. To fill the existing research gap, we explore generative adversarial networks (GANs) as a potential answer to that problem. Specifically, we present a novel framework, named CISSL-GANs, for class-imbalanced semi-supervised learning (CISSL) by leveraging a dynamic class-rebalancing (DCR) sampler, which exploits the property that a classifier trained on class-imbalanced data produces high-precision pseudo-labels on minority classes, thereby turning the bias inherent in pseudo-labels to advantage. Also, given the well-known difficulty of training GANs on complex data, we investigate three practical techniques to improve the training dynamics without altering the global equilibrium. Experimental results demonstrate that our CISSL-GANs are capable of simultaneously improving fundus image class-conditional generation and classification performance under a typical label-insufficient and imbalanced scenario. Our code is available at: https://github.com/Xyporz/CISSL-GANs.
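A dynamic class-rebalancing sampler can be illustrated in a few lines: unlabeled samples are drawn with probability inversely proportional to the frequency of their current pseudo-label, so minority classes are visited more often. This is only a sketch of the general mechanism; the paper's DCR sampler may differ in detail.

```python
# Sketch of a dynamic class-rebalancing (DCR) sampler over pseudo-labels.
import numpy as np

def dcr_sample(pseudo_labels, batch_size, rng=None):
    """pseudo_labels: int array with one (possibly noisy) label per unlabeled image.
    Refresh the labels periodically as the classifier improves."""
    rng = np.random.default_rng(rng)
    counts = np.bincount(pseudo_labels)
    weights = 1.0 / counts[pseudo_labels]   # rare pseudo-classes -> high weight
    probs = weights / weights.sum()
    return rng.choice(len(pseudo_labels), size=batch_size, p=probs)
```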
Collapse
|
21
|
Musha A, Hasnat R, Mamun AA, Ping EP, Ghosh T. Computer-Aided Bleeding Detection Algorithms for Capsule Endoscopy: A Systematic Review. SENSORS (BASEL, SWITZERLAND) 2023; 23:7170. [PMID: 37631707 PMCID: PMC10459126 DOI: 10.3390/s23167170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/27/2023] [Revised: 08/08/2023] [Accepted: 08/10/2023] [Indexed: 08/27/2023]
Abstract
Capsule endoscopy (CE) is a widely used medical imaging tool for diagnosing gastrointestinal tract abnormalities such as bleeding. However, CE captures a huge number of image frames, making manual inspection a time-consuming and tedious task for medical experts. To address this issue, researchers have focused on computer-aided bleeding detection systems that automatically identify bleeding in real time. This paper presents a systematic review of the available state-of-the-art computer-aided bleeding detection algorithms for capsule endoscopy. The review was carried out by searching five repositories (Scopus, PubMed, IEEE Xplore, ACM Digital Library, and ScienceDirect) for all original publications on computer-aided bleeding detection published between 2001 and 2023. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) methodology was used to perform the review, and 147 full texts of scientific papers were reviewed. The contributions of this paper are: (I) a taxonomy for computer-aided bleeding detection algorithms for capsule endoscopy is identified; (II) the available state-of-the-art computer-aided bleeding detection algorithms, including various color spaces (RGB, HSV, etc.), feature extraction techniques, and classifiers, are discussed; and (III) the most effective algorithms for practical use are identified. Finally, the paper concludes by providing future directions for computer-aided bleeding detection research.
Collapse
Affiliation(s)
- Ahmmad Musha
- Department of Electrical and Electronic Engineering, Pabna University of Science and Technology, Pabna 6600, Bangladesh; (A.M.); (R.H.)
| | - Rehnuma Hasnat
- Department of Electrical and Electronic Engineering, Pabna University of Science and Technology, Pabna 6600, Bangladesh; (A.M.); (R.H.)
| | - Abdullah Al Mamun
- Faculty of Engineering and Technology, Multimedia University, Melaka 75450, Malaysia;
| | - Em Poh Ping
- Faculty of Engineering and Technology, Multimedia University, Melaka 75450, Malaysia;
| | - Tonmoy Ghosh
- Department of Electrical and Computer Engineering, The University of Alabama, Tuscaloosa, AL 35487, USA;
| |
Collapse
|
22
|
Ong J, Waisberg E, Masalkhi M, Kamran SA, Lowry K, Sarker P, Zaman N, Paladugu P, Tavakkoli A, Lee AG. Artificial Intelligence Frameworks to Detect and Investigate the Pathophysiology of Spaceflight Associated Neuro-Ocular Syndrome (SANS). Brain Sci 2023; 13:1148. [PMID: 37626504 PMCID: PMC10452366 DOI: 10.3390/brainsci13081148] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2023] [Revised: 07/24/2023] [Accepted: 07/28/2023] [Indexed: 08/27/2023] Open
Abstract
Spaceflight associated neuro-ocular syndrome (SANS) is a unique phenomenon that has been observed in astronauts who have undergone long-duration spaceflight (LDSF). The syndrome is characterized by distinct imaging and clinical findings including optic disc edema, hyperopic refractive shift, posterior globe flattening, and choroidal folds. SANS poses a major barrier to planetary spaceflight, such as a mission to Mars, and has been rated by the National Aeronautics and Space Administration (NASA) as a high risk based on its likelihood of occurring and its severity for human health and mission performance. Although it is a major barrier to future spaceflight, the underlying etiology of SANS is not well understood. Current ophthalmic imaging onboard the International Space Station (ISS) has provided further insights into SANS. However, the spaceflight environment presents unique challenges and limitations to further understanding this microgravity-induced phenomenon. The advent of artificial intelligence (AI) has revolutionized the field of imaging in ophthalmology, particularly in detection and monitoring. In this manuscript, we describe the current hypothesized pathophysiology of SANS and the medical diagnostic limitations during spaceflight that hamper understanding of its pathogenesis. We then introduce and describe various AI frameworks that can be applied to ophthalmic imaging onboard the ISS to further understand SANS, including supervised/unsupervised learning, generative adversarial networks, and transfer learning. We conclude by describing current research in this area, with the goal of enabling deeper insights into SANS and safer spaceflight for future missions.
Collapse
Affiliation(s)
- Joshua Ong
- Department of Ophthalmology and Visual Sciences, University of Michigan Kellogg Eye Center, Ann Arbor, MI 48105, USA
| | | | - Mouayad Masalkhi
- University College Dublin School of Medicine, Belfield, Dublin 4, Ireland
| | - Sharif Amit Kamran
- Human-Machine Perception Laboratory, Department of Computer Science and Engineering, University of Nevada, Reno, NV 89512, USA
| | | | - Prithul Sarker
- Human-Machine Perception Laboratory, Department of Computer Science and Engineering, University of Nevada, Reno, NV 89512, USA
| | - Nasif Zaman
- Human-Machine Perception Laboratory, Department of Computer Science and Engineering, University of Nevada, Reno, NV 89512, USA
| | - Phani Paladugu
- Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
- Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA 19107, USA
| | - Alireza Tavakkoli
- Human-Machine Perception Laboratory, Department of Computer Science and Engineering, University of Nevada, Reno, NV 89512, USA
| | - Andrew G. Lee
- Center for Space Medicine, Baylor College of Medicine, Houston, TX 77030, USA
- Department of Ophthalmology, Blanton Eye Institute, Houston Methodist Hospital, Houston, TX 77030, USA
- The Houston Methodist Research Institute, Houston Methodist Hospital, Houston, TX 77030, USA
- Departments of Ophthalmology, Neurology, and Neurosurgery, Weill Cornell Medicine, New York, NY 10065, USA
- Department of Ophthalmology, University of Texas Medical Branch, Galveston, TX 77555, USA
- University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Texas A&M College of Medicine, Bryan, TX 77030, USA
- Department of Ophthalmology, The University of Iowa Hospitals and Clinics, Iowa City, IA 50010, USA
| |
Collapse
|
23
|
Wang Z, Lim G, Ng WY, Tan TE, Lim J, Lim SH, Foo V, Lim J, Sinisterra LG, Zheng F, Liu N, Tan GSW, Cheng CY, Cheung GCM, Wong TY, Ting DSW. Synthetic artificial intelligence using generative adversarial network for retinal imaging in detection of age-related macular degeneration. Front Med (Lausanne) 2023; 10:1184892. [PMID: 37425325 PMCID: PMC10324667 DOI: 10.3389/fmed.2023.1184892] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2023] [Accepted: 05/30/2023] [Indexed: 07/11/2023] Open
Abstract
Introduction Age-related macular degeneration (AMD) is one of the leading causes of vision impairment globally, and early detection is crucial to prevent vision loss. However, screening for AMD is resource dependent and demands experienced healthcare providers. Recently, deep learning (DL) systems have shown potential for the effective detection of various eye diseases from retinal fundus images, but the development of such robust systems requires large datasets, which could be limited by the prevalence of the disease and patient privacy. In the case of AMD, the advanced phenotype is often scarce for conducting DL analysis, which may be tackled by generating synthetic images using Generative Adversarial Networks (GANs). This study aims to develop GAN-synthesized fundus photos with AMD lesions and to assess the realness of these images with an objective scale. Methods To build our GAN models, a total of 125,012 fundus photos were used from a real-world non-AMD phenotypical dataset. StyleGAN2 and a human-in-the-loop (HITL) method were then applied to synthesize fundus images with AMD features. To objectively assess the quality of the synthesized images, we proposed a novel realness scale based on the frequency of broken vessels observed in the fundus photos. Four residents conducted two rounds of grading on 300 images to distinguish real from synthetic images, based on their subjective impression and the objective scale, respectively. Results and discussion The introduction of HITL training increased the percentage of synthetic images with AMD lesions, despite the limited number of AMD images in the initial training dataset. Qualitatively, the synthesized images proved robust in that our residents had limited ability to distinguish real from synthetic ones, as evidenced by an overall accuracy of 0.66 (95% CI: 0.61-0.66) and a Cohen's kappa of 0.320. For the non-referable AMD classes (no or early AMD), the accuracy was only 0.51. With the objective scale, the overall accuracy improved to 0.72. In conclusion, GAN models built with HITL training are capable of producing realistic-looking fundus images that can fool human experts, while our objective realness scale based on broken vessels can help identify synthetic fundus photos.
Collapse
Affiliation(s)
- Zhaoran Wang
- Duke-NUS Medical School, National University of Singapore, Singapore, Singapore
| | - Gilbert Lim
- Duke-NUS Medical School, National University of Singapore, Singapore, Singapore
- Singapore Eye Research Institute, Singapore, Singapore
| | - Wei Yan Ng
- Singapore Eye Research Institute, Singapore, Singapore
- Singapore National Eye Centre, Singapore, Singapore
| | - Tien-En Tan
- Singapore Eye Research Institute, Singapore, Singapore
- Singapore National Eye Centre, Singapore, Singapore
| | - Jane Lim
- Singapore Eye Research Institute, Singapore, Singapore
- Singapore National Eye Centre, Singapore, Singapore
| | - Sing Hui Lim
- Singapore Eye Research Institute, Singapore, Singapore
- Singapore National Eye Centre, Singapore, Singapore
| | - Valencia Foo
- Singapore Eye Research Institute, Singapore, Singapore
- Singapore National Eye Centre, Singapore, Singapore
| | - Joshua Lim
- Singapore Eye Research Institute, Singapore, Singapore
- Singapore National Eye Centre, Singapore, Singapore
| | | | - Feihui Zheng
- Singapore Eye Research Institute, Singapore, Singapore
| | - Nan Liu
- Duke-NUS Medical School, National University of Singapore, Singapore, Singapore
- Singapore Eye Research Institute, Singapore, Singapore
| | - Gavin Siew Wei Tan
- Duke-NUS Medical School, National University of Singapore, Singapore, Singapore
- Singapore Eye Research Institute, Singapore, Singapore
- Singapore National Eye Centre, Singapore, Singapore
| | - Ching-Yu Cheng
- Duke-NUS Medical School, National University of Singapore, Singapore, Singapore
- Singapore Eye Research Institute, Singapore, Singapore
- Singapore National Eye Centre, Singapore, Singapore
| | - Gemmy Chui Ming Cheung
- Duke-NUS Medical School, National University of Singapore, Singapore, Singapore
- Singapore Eye Research Institute, Singapore, Singapore
- Singapore National Eye Centre, Singapore, Singapore
| | - Tien Yin Wong
- Singapore National Eye Centre, Singapore, Singapore
- School of Medicine, Tsinghua University, Beijing, China
| | - Daniel Shu Wei Ting
- Duke-NUS Medical School, National University of Singapore, Singapore, Singapore
- Singapore Eye Research Institute, Singapore, Singapore
- Singapore National Eye Centre, Singapore, Singapore
| |
Collapse
|
24
|
Veturi YA, Woof W, Lazebnik T, Moghul I, Woodward-Court P, Wagner SK, Cabral de Guimarães TA, Daich Varela M, Liefers B, Patel PJ, Beck S, Webster AR, Mahroo O, Keane PA, Michaelides M, Balaskas K, Pontikos N. SynthEye: Investigating the Impact of Synthetic Data on Artificial Intelligence-assisted Gene Diagnosis of Inherited Retinal Disease. OPHTHALMOLOGY SCIENCE 2023; 3:100258. [PMID: 36685715 PMCID: PMC9852957 DOI: 10.1016/j.xops.2022.100258] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/30/2022] [Revised: 11/08/2022] [Accepted: 11/09/2022] [Indexed: 11/23/2022]
Abstract
Purpose Rare disease diagnosis is challenging for medical image-based artificial intelligence due to a natural class imbalance in datasets, leading to biased prediction models. Inherited retinal diseases (IRDs) are a research domain that particularly faces this issue. This study investigates the applicability of synthetic data in improving artificial intelligence-enabled diagnosis of IRDs using generative adversarial networks (GANs). Design Diagnostic study of gene-labeled fundus autofluorescence (FAF) IRD images using deep learning. Participants Moorfields Eye Hospital (MEH) dataset of 15 692 FAF images obtained from 1800 patients with a confirmed genetic diagnosis of 1 of 36 IRD genes. Methods A StyleGAN2 model is trained on the IRD dataset to generate 512 × 512 resolution images. Convolutional neural networks are trained for classification using different synthetically augmented datasets, including real IRD images plus 1800 and 3600 synthetic images, and a fully rebalanced dataset. We also perform an experiment with only synthetic data. All models are compared against a baseline convolutional neural network trained only on real data. Main Outcome Measures We evaluated synthetic data quality using a Visual Turing Test conducted with 4 ophthalmologists from MEH. Synthetic and real images were compared using feature space visualization, similarity analysis to detect memorized images, and the Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) score for no-reference-based quality evaluation. Convolutional neural network diagnostic performance was determined on a held-out test set using the area under the receiver operating characteristic curve (AUROC) and Cohen's Kappa (κ). Results An average true recognition rate of 63% and a fake recognition rate of 47% were obtained from the Visual Turing Test. Thus, a considerable proportion of the synthetic images were classified as real by clinical experts. Similarity analysis showed that the synthetic images were not copies of the real images, indicating that the GAN did not simply memorize the real images and was able to generalize. However, BRISQUE score analysis indicated that synthetic images were of significantly lower quality overall than real images (P < 0.05). Comparing the rebalanced model (RB) with the baseline (R), no significant change in the average AUROC and κ was found (R-AUROC = 0.86 [0.85-0.88], RB-AUROC = 0.88 [0.86-0.89], R-κ = 0.51 [0.49-0.53], and RB-κ = 0.52 [0.50-0.54]). The synthetic data trained model (S) achieved performance similar to the baseline (S-AUROC = 0.86 [0.85-0.87], S-κ = 0.48 [0.46-0.50]). Conclusions Synthetic generation of realistic IRD FAF images is feasible. Synthetic data augmentation does not deliver improvements in classification performance. However, synthetic data alone deliver performance similar to real data, and hence may be useful as a proxy for real data. Financial Disclosure(s): Proprietary or commercial disclosure may be found after the references.
Collapse
Key Words
- AUROC, area under the receiver operating characteristic curve
- BRISQUE, Blind/Referenceless Image Spatial Quality Evaluator
- Class imbalance
- Clinical Decision-Support Model
- DL, deep learning
- Deep Learning
- FAF, fundus autofluorescence
- FRR, Fake Recognition Rate
- GAN, generative adversarial network
- Generative Adversarial Networks
- IRD, inherited retinal disease
- Inherited Retinal Diseases
- MEH, Moorfields Eye Hospital
- R, baseline model
- RB, rebalanced model
- S, synthetic data trained model
- Synthetic data
- TRR, True Recognition Rate
- UMAP, Universal Manifold Approximation and Projection
Collapse
Affiliation(s)
- Yoga Advaith Veturi
- University College London Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
| | - William Woof
- University College London Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
| | - Teddy Lazebnik
- University College London Cancer Institute, University College London, London, UK
| | | | - Peter Woodward-Court
- University College London Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
| | - Siegfried K. Wagner
- University College London Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
| | | | - Malena Daich Varela
- University College London Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
| | | | | | - Stephan Beck
- University College London Cancer Institute, University College London, London, UK
| | - Andrew R. Webster
- University College London Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
| | - Omar Mahroo
- University College London Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
| | - Pearse A. Keane
- University College London Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
| | - Michel Michaelides
- University College London Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
| | - Konstantinos Balaskas
- University College London Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
| | - Nikolas Pontikos
- University College London Institute of Ophthalmology, University College London, London, UK
- Moorfields Eye Hospital, London, UK
| |
Collapse
|
25
|
Tan Y, Zhao SX, Yang KF, Li YJ. A lightweight network guided with differential matched filtering for retinal vessel segmentation. Comput Biol Med 2023; 160:106924. [PMID: 37146492 DOI: 10.1016/j.compbiomed.2023.106924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2022] [Revised: 04/03/2023] [Accepted: 04/13/2023] [Indexed: 05/07/2023]
Abstract
The geometric morphology of retinal vessels reflects the state of cardiovascular health, and fundus images are important reference materials for ophthalmologists. Great progress has been made in automated vessel segmentation, but few studies have focused on thin vessel breakage and false-positives in areas with lesions or low contrast. In this work, we propose a new network, differential matched filtering guided attention UNet (DMF-AU), to address these issues, incorporating a differential matched filtering layer, feature anisotropic attention, and a multiscale consistency constrained backbone to perform thin vessel segmentation. The differential matched filtering is used for the early identification of locally linear vessels, and the resulting rough vessel map guides the backbone to learn vascular details. Feature anisotropic attention reinforces the vessel features of spatial linearity at each stage of the model. Multiscale constraints reduce the loss of vessel information while pooling within large receptive fields. In tests on multiple classical datasets, the proposed model performed well compared with other algorithms on several specially designed criteria for vessel segmentation. DMF-AU is a high-performance, lightweight vessel segmentation model. The source code is at https://github.com/tyb311/DMF-AU.
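For readers unfamiliar with matched filtering for vessels, the sketch below builds a classic bank of oriented, zero-mean Gaussian cross-section kernels and takes the maximum response over orientations to highlight locally linear structures. It only illustrates the principle the model builds on; the paper's differential matched filtering layer is a distinct variant embedded inside the network.

```python
# Classic oriented matched-filter bank for rough vessel detection (sketch).
import numpy as np
from scipy.ndimage import rotate, convolve

def matched_filter_bank(sigma=1.5, length=9, n_angles=12):
    half = int(3 * sigma)
    x = np.arange(-half, half + 1)
    profile = -np.exp(-x**2 / (2 * sigma**2))   # dark vessel cross-section
    profile -= profile.mean()                   # zero-mean, suppresses flat background
    kernel = np.tile(profile, (length, 1))      # extend profile along the vessel axis
    angles = np.linspace(0, 180, n_angles, endpoint=False)
    return [rotate(kernel, a, reshape=True, order=1) for a in angles]

def vessel_response(image):
    """Maximum filter response over all orientations."""
    return np.max([convolve(image.astype(float), k) for k in matched_filter_bank()],
                  axis=0)
```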
Collapse
Affiliation(s)
- Yubo Tan
- The MOE Key Laboratory for Neuroinformation, Radiation Oncology Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, China.
| | - Shi-Xuan Zhao
- The MOE Key Laboratory for Neuroinformation, Radiation Oncology Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, China.
| | - Kai-Fu Yang
- The MOE Key Laboratory for Neuroinformation, Radiation Oncology Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, China.
| | - Yong-Jie Li
- The MOE Key Laboratory for Neuroinformation, Radiation Oncology Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, China.
| |
Collapse
|
26
|
Xia Y, Ravikumar N, Lassila T, Frangi AF. Virtual high-resolution MR angiography from non-angiographic multi-contrast MRIs: synthetic vascular model populations for in-silico trials. Med Image Anal 2023; 87:102814. [PMID: 37196537 DOI: 10.1016/j.media.2023.102814] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/11/2022] [Revised: 04/04/2023] [Accepted: 04/08/2023] [Indexed: 05/19/2023]
Abstract
Despite success in multi-contrast MR image synthesis, generating specific modalities remains challenging. These include Magnetic Resonance Angiography (MRA), which highlights details of vascular anatomy using specialised imaging sequences emphasising the inflow effect. This work proposes an end-to-end generative adversarial network that can synthesise anatomically plausible, high-resolution 3D MRA images using commonly acquired multi-contrast MR images (e.g. T1/T2/PD-weighted MR images) of the same subject whilst preserving the continuity of vascular anatomy. A reliable technique for MRA synthesis would unleash the research potential of the very few population databases with imaging modalities (such as MRA) that enable quantitative characterisation of whole-brain vasculature. Our work is motivated by the need to generate digital twins and virtual patients of cerebrovascular anatomy for in-silico studies and/or in-silico trials. We propose a dedicated generator and discriminator that leverage the shared and complementary features of multi-source images. We design a composite loss function that emphasises vascular properties by minimising the statistical difference between the feature representations of the target images and the synthesised outputs in both the 3D volumetric and 2D projection domains. Experimental results show that the proposed method can synthesise high-quality MRA images and outperforms state-of-the-art generative models both qualitatively and quantitatively. The importance assessment reveals that T2- and PD-weighted images are better predictors of MRA images than T1; and PD-weighted images contribute to better visibility of small vessel branches towards the peripheral regions. In addition, the proposed approach can generalise to unseen data acquired at different imaging centres with different scanners, whilst synthesising MRAs and vascular geometries that maintain vessel continuity. The results show the potential of the proposed approach for generating digital twin cohorts of cerebrovascular anatomy at scale from structural MR images typically acquired in population imaging initiatives.
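One way to realise a 2D projection-domain term, assuming the composite loss compares maximum intensity projections (MIPs) of the synthesised and target volumes along each axis alongside a 3D volumetric term, is sketched below; the weighting and feature spaces used in the paper may differ.

```python
# Sketch of a volumetric + projection-domain loss for 3D MRA synthesis.
import torch
import torch.nn.functional as F

def mra_loss(fake, real, lam=1.0):
    """fake, real: (B, 1, D, H, W) tensors."""
    vol = F.l1_loss(fake, real)                       # 3D volumetric term
    # MIPs along depth, height, and width approximate angiographic views.
    proj = sum(F.l1_loss(fake.amax(dim=d), real.amax(dim=d)) for d in (2, 3, 4))
    return vol + lam * proj
```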
Collapse
Affiliation(s)
- Yan Xia
- Centre for Computational Imaging and Simulation Technologies in Biomedicine (CISTIB), School of Computing, University of Leeds, Leeds, UK.
| | - Nishant Ravikumar
- Centre for Computational Imaging and Simulation Technologies in Biomedicine (CISTIB), School of Computing, University of Leeds, Leeds, UK
| | - Toni Lassila
- Centre for Computational Imaging and Simulation Technologies in Biomedicine (CISTIB), School of Computing, University of Leeds, Leeds, UK
| | - Alejandro F Frangi
- Centre for Computational Imaging and Simulation Technologies in Biomedicine (CISTIB), School of Computing, University of Leeds, Leeds, UK; Leeds Institute for Cardiovascular and Metabolic Medicine (LICAMM), School of Medicine, University of Leeds, Leeds, UK; Medical Imaging Research Center (MIRC), Cardiovascular Science and Electronic Engineering Departments, KU Leuven, Leuven, Belgium; Alan Turing Institute, London, UK
| |
Collapse
|
27
|
Exploring healthy retinal aging with deep learning. OPHTHALMOLOGY SCIENCE 2023; 3:100294. [PMID: 37113474 PMCID: PMC10127123 DOI: 10.1016/j.xops.2023.100294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/30/2022] [Revised: 01/24/2023] [Accepted: 02/17/2023] [Indexed: 03/04/2023]
Abstract
Purpose To study the individual course of retinal changes caused by healthy aging using deep learning. Design Retrospective analysis of a large data set of retinal OCT images. Participants A total of 85 709 adults between the age of 40 and 75 years of whom OCT images were acquired in the scope of the UK Biobank population study. Methods We created a counterfactual generative adversarial network (GAN), a type of neural network that learns from cross-sectional, retrospective data. It then synthesizes high-resolution counterfactual OCT images and longitudinal time series. These counterfactuals allow visualization and analysis of hypothetical scenarios in which certain characteristics of the imaged subject, such as age or sex, are altered, whereas other attributes, crucially the subject's identity and image acquisition settings, remain fixed. Main Outcome Measures Using our counterfactual GAN, we investigated subject-specific changes in the retinal layer structure as a function of age and sex. In particular, we measured changes in the retinal nerve fiber layer (RNFL), combined ganglion cell layer plus inner plexiform layer (GCIPL), inner nuclear layer to the inner boundary of the retinal pigment epithelium (INL-RPE), and retinal pigment epithelium (RPE). Results Our counterfactual GAN is able to smoothly visualize the individual course of retinal aging. Across all counterfactual images, the RNFL, GCIPL, INL-RPE, and RPE changed by -0.1 μm ± 0.1 μm, -0.5 μm ± 0.2 μm, -0.2 μm ± 0.1 μm, and 0.1 μm ± 0.1 μm, respectively, per decade of age. These results agree well with previous studies based on the same cohort from the UK Biobank population study. Beyond population-wide average measures, our counterfactual GAN allows us to explore whether the retinal layers of a given eye will increase in thickness, decrease in thickness, or stagnate as a subject ages. Conclusion This study demonstrates how counterfactual GANs can aid research into retinal aging by generating high-resolution, high-fidelity OCT images, and longitudinal time series. Ultimately, we envision that they will enable clinical experts to derive and explore hypotheses for potential imaging biomarkers for healthy and pathologic aging that can be refined and tested in prospective clinical trials. Financial Disclosures Proprietary or commercial disclosure may be found after the references.
Collapse
|
28
|
Liu Y, Yang F, Yang Y. A partial convolution generative adversarial network for lesion synthesis and enhanced liver tumor segmentation. J Appl Clin Med Phys 2023; 24:e13927. [PMID: 36800255 PMCID: PMC10113707 DOI: 10.1002/acm2.13927] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/24/2022] [Revised: 01/05/2023] [Accepted: 01/17/2023] [Indexed: 02/18/2023] Open
Abstract
Lesion segmentation is critical for clinicians to accurately stage disease and determine treatment strategy. Deep learning based automatic segmentation can improve both segmentation efficiency and accuracy. However, training a robust deep learning segmentation model requires sufficient training examples with sufficient diversity in lesion location and lesion size. This study develops a deep learning framework for the generation of synthetic lesions with various locations and sizes that can be included in the training dataset to enhance lesion segmentation performance. The lesion synthesis network is a modified generative adversarial network (GAN). Specifically, we introduce a partial convolution strategy to construct a U-Net-like generator. The discriminator is designed using Wasserstein GAN with gradient penalty and spectral normalization. A mask generation method based on principal component analysis (PCA) was developed to model various lesion shapes. The generated masks are then converted into liver lesions through the lesion synthesis network. The lesion synthesis framework was evaluated for lesion textures, and the synthetic lesions were used to train a lesion segmentation network to further validate the effectiveness of the framework. All networks are trained and tested on the public LiTS dataset. Our experiments demonstrate that the synthetic lesions generated by our approach have very similar distributions for the two texture parameters, GLCM-energy and GLCM-correlation. Including the synthetic lesions in the segmentation network improved the Dice score from 67.3% to 71.4%. Meanwhile, the precision and sensitivity for lesion segmentation were improved from 74.6% to 76.0% and from 66.1% to 70.9%, respectively. The proposed lesion synthesis approach outperforms the other two existing approaches. Including the synthetic lesion data in the training dataset significantly improves segmentation performance.
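The partial convolution building block (after Liu et al., 2018, originally proposed for image inpainting) can be sketched as follows: the convolution is evaluated only over valid pixels, the response is renormalised by the local mask coverage, and the validity mask is updated for the next layer. This is a minimal illustration, not the authors' exact generator.

```python
# Minimal partial convolution layer (sketch after Liu et al., 2018).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PartialConv2d(nn.Module):
    def __init__(self, cin, cout, k=3):
        super().__init__()
        self.conv = nn.Conv2d(cin, cout, k, padding=k // 2, bias=True)
        self.register_buffer("ones", torch.ones(1, cin, k, k))

    def forward(self, x, mask):                 # mask: (B, 1, H, W), 1 = valid pixel
        m = mask.expand_as(x)
        pad = self.conv.padding[0]
        valid = F.conv2d(m, self.ones, padding=pad)          # valid-pixel count per window
        out = F.conv2d(x * m, self.conv.weight, padding=pad) # convolve masked input only
        scale = self.ones.sum() / valid.clamp(min=1.0)       # renormalize by coverage
        out = out * scale + self.conv.bias.view(1, -1, 1, 1)
        new_mask = (valid > 0).float()[:, :1]                # window saw any valid pixel
        return out, new_mask
```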
Collapse
Affiliation(s)
- Yingao Liu
- Department of Engineering and Applied Physics, University of Science and Technology of China, Hefei, Anhui, China
| | - Fei Yang
- Department of Radiation Oncology, University of Miami School of Medicine, Miami, Florida, USA
| | - Yidong Yang
- Department of Radiation Oncology, the First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China.,School of Physical Sciences & the Ion Medical Research Institute, University of Science and Technology of China, Hefei, Anhui, China
| |
Collapse
|
29
|
Wisaeng K. Retinal Blood-Vessel Extraction Using Weighted Kernel Fuzzy C-Means Clustering and Dilation-Based Functions. Diagnostics (Basel) 2023; 13:diagnostics13030342. [PMID: 36766446 PMCID: PMC9914389 DOI: 10.3390/diagnostics13030342] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2022] [Revised: 01/04/2023] [Accepted: 01/09/2023] [Indexed: 01/19/2023] Open
Abstract
Automated blood-vessel extraction is essential in diagnosing Diabetic Retinopathy (DR) and other eye-related diseases. However, traditional methods for extracting blood vessels tend to provide low accuracy in difficult situations, such as extracting both micro and large blood vessels simultaneously from low-intensity images and from blood vessels with DR. This paper proposes a complete preprocessing method to enhance original retinal images before passing the enhanced images to a novel blood-vessel extraction method comprising three combined extraction stages. The first stage focuses on fast extraction of retinal blood vessels using Weighted Kernel Fuzzy C-Means (WKFCM) clustering to separate the vessel features from the retinal background. The second stage focuses on the accuracy of full-size images, achieving regional recognition of large and micro blood vessels and minimizing false extraction. This stage implements a mathematical dilation operator from a trained model called the Dilation-Based Function (DBF). Finally, an optimal parameter threshold is determined empirically in the third stage to remove non-vessel features in the binary image and improve the overall extraction results. In evaluations on the DRIVE, STARE, and DiaretDB0 datasets, the proposed WKFCM-DBF method achieved sensitivity, specificity, and accuracy of 98.12%, 98.20%, and 98.16% on DRIVE; 98.42%, 98.80%, and 98.51% on STARE; and 98.89%, 98.10%, and 98.09% on DiaretDB0, respectively.
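For reference, a compact fuzzy C-means loop on raw pixel intensities is sketched below as a simplified stand-in for the paper's weighted kernel FCM, which additionally uses kernel distances and feature weighting; memberships and centroids are updated alternately.

```python
# Plain fuzzy C-means on 1-D pixel intensities (simplified stand-in for WKFCM).
import numpy as np

def fcm(values, c=2, m=2.0, iters=50, rng=None):
    """values: 1-D array of pixel intensities; returns (memberships, centers)."""
    rng = np.random.default_rng(rng)
    u = rng.dirichlet(np.ones(c), size=len(values))    # (N, c), rows sum to 1
    for _ in range(iters):
        um = u ** m
        centers = (um.T @ values) / um.sum(0)          # fuzzily weighted centroids
        d = np.abs(values[:, None] - centers[None, :]) + 1e-9
        p = 2.0 / (m - 1.0)
        # Standard FCM membership update: u_ik ∝ d_ik^(-p), normalized per pixel.
        u = (1.0 / d ** p) / (1.0 / d ** p).sum(1, keepdims=True)
    return u, centers
```

Thresholding the vessel-cluster membership map then yields the rough binary mask that the later dilation-based stages would refine.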
Collapse
Affiliation(s)
- Kittipol Wisaeng
- Technology and Business Information System Unit, Mahasarakham Business School, Mahasarakham University, Mahasarakham 44150, Thailand
| |
Collapse
|
30
|
Image-to-image translation with Generative Adversarial Networks via retinal masks for realistic Optical Coherence Tomography imaging of Diabetic Macular Edema disorders. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]
|
31
|
Liu C, Wang D, Zhang H, Wu W, Sun W, Zhao T, Zheng N. Using Simulated Training Data of Voxel-Level Generative Models to Improve 3D Neuron Reconstruction. IEEE TRANSACTIONS ON MEDICAL IMAGING 2022; 41:3624-3635. [PMID: 35834465 DOI: 10.1109/tmi.2022.3191011] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Reconstructing neuron morphologies from fluorescence microscope images plays a critical role in neuroscience studies. It relies on image segmentation to produce initial masks, either for further processing or as final results representing neuronal morphologies. This has been a challenging step due to the variation and complexity of noisy intensity patterns in neuron images acquired from microscopes. While progress in deep learning has brought the goal of accurate segmentation much closer to reality, creating training data to produce powerful neural networks is often laborious. To overcome the difficulty of obtaining a vast number of annotated data, we propose a novel strategy of using two-stage generative models to simulate training data with voxel-level labels. Trained on unlabeled data by optimizing a novel objective function that preserves predefined labels, the models are able to synthesize realistic 3D images with underlying voxel labels. We showed that these synthetic images could train segmentation networks to obtain even better performance than manually labeled data. To demonstrate an immediate impact of our work, we further showed that segmentation results produced by networks trained on synthetic data could be used to improve existing neuron reconstruction methods.
Collapse
|
32
|
Zhao M, Lu Z, Zhu S, Wang X, Feng J. Automatic generation of retinal optical coherence tomography images based on generative adversarial networks. Med Phys 2022; 49:7357-7367. [PMID: 36122302 DOI: 10.1002/mp.15988] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2022] [Revised: 07/13/2022] [Accepted: 08/28/2022] [Indexed: 12/13/2022] Open
Abstract
SIGNIFICANCE The automatic generation algorithm for optical coherence tomography (OCT) images based on generative adversarial networks (GAN) can generate a large number of simulated images from a relatively small number of real images, which can effectively improve classification performance. AIM We proposed an automatic generation algorithm for retinal OCT images based on GAN to alleviate the problem of insufficient high-quality images in deep learning and to move the diagnosis algorithm toward clinical application. APPROACH We designed a generation network based on GAN and trained the network with a dataset constructed from 2014_BOE_Srinivasan and OCT2017 to acquire three models. Then, we generated a large number of images with the three models to augment age-related macular degeneration (AMD), diabetic macular edema (DME), and normal images. We evaluated the generated images by subjective visual observation, Fréchet inception distance (FID) scores, and a classification experiment. RESULTS Visual observation shows that the generated images have clear features similar to those of the real images. Also, the lesion regions containing similar features in the real image and the generated image are randomly distributed in the image field of view. Three locally optimal models, one each for AMD, DME, and normal images, were obtained at the lowest FID scores of the three types of generated images, indicating that the generated images have high quality and diversity. Moreover, the classification experiment results show that the model trained with the mixed images performs better than the model trained with real images, with the accuracy, sensitivity, and specificity improved by 5.56%, 8.89%, and 2.22%. In addition, compared with the generation method based on a variational auto-encoder (VAE), our method improved the accuracy, sensitivity, and specificity by 1.97%, 2.97%, and 0.99% on the same test set. CONCLUSIONS The results show that our method can augment the three kinds of OCT images, not only effectively alleviating the problem of insufficient high-quality images but also improving the diagnosis performance.
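The FID criterion used here to select the locally optimal models has a closed form once image features are extracted: real and generated feature sets are modelled as Gaussians and compared via their means and covariances. A minimal sketch, assuming features have already been computed with an Inception-style embedder:

```python
# Fréchet inception distance between two feature sets (sketch).
import numpy as np
from scipy.linalg import sqrtm

def fid(feats_real, feats_fake):
    """feats_*: (N, D) arrays of embedding features for real/generated images."""
    mu1, mu2 = feats_real.mean(0), feats_fake.mean(0)
    c1 = np.cov(feats_real, rowvar=False)
    c2 = np.cov(feats_fake, rowvar=False)
    covmean = sqrtm(c1 @ c2)
    if np.iscomplexobj(covmean):          # discard tiny imaginary parts from sqrtm
        covmean = covmean.real
    return float(((mu1 - mu2) ** 2).sum() + np.trace(c1 + c2 - 2 * covmean))
```

Lower FID indicates that the generated feature distribution sits closer to the real one, which is why the models with the lowest scores are retained.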
Collapse
Affiliation(s)
- Mengmeng Zhao
- Beijing University of Technology, Intelligent Physiological Measurement and Clinical Translation, International Base for Science and Technology Cooperation, Department of Biomedical Engineering, Beijing University of Technology, Beijing, China
| | - Zhenzhen Lu
- Beijing University of Technology, Intelligent Physiological Measurement and Clinical Translation, International Base for Science and Technology Cooperation, Department of Biomedical Engineering, Beijing University of Technology, Beijing, China
| | - Shuyuan Zhu
- Beijing University of Technology, Intelligent Physiological Measurement and Clinical Translation, International Base for Science and Technology Cooperation, Department of Biomedical Engineering, Beijing University of Technology, Beijing, China
| | - Xiaobing Wang
- Capital University of Physical Education and Sports, Sports and Medicine Integrative Innovation Center, Capital University of Physical Education and Sports, Beijing, China
| | - Jihong Feng
- Beijing University of Technology, Intelligent Physiological Measurement and Clinical Translation, International Base for Science and Technology Cooperation, Department of Biomedical Engineering, Beijing University of Technology, Beijing, China
| |
Collapse
|
33
|
Kugelman J, Alonso-Caneiro D, Read SA, Collins MJ. A review of generative adversarial network applications in optical coherence tomography image analysis. JOURNAL OF OPTOMETRY 2022; 15 Suppl 1:S1-S11. [PMID: 36241526 PMCID: PMC9732473 DOI: 10.1016/j.optom.2022.09.004] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/31/2022] [Revised: 08/19/2022] [Accepted: 09/20/2022] [Indexed: 06/16/2023]
Abstract
Optical coherence tomography (OCT) has revolutionized ophthalmic clinical practice and research, as a result of the high-resolution images that the method is able to capture in a fast, non-invasive manner. Although clinicians can interpret OCT images qualitatively, the ability to quantitatively and automatically analyse these images represents a key goal for eye care by providing clinicians with immediate and relevant metrics to inform best clinical practice. The range of applications and methods to analyse OCT images is rich and rapidly expanding. With the advent of deep learning methods, the field has experienced significant progress, with state-of-the-art performance for several OCT image analysis tasks. Generative adversarial networks (GANs) represent a subfield of deep learning that allows for a range of novel applications not possible in most other deep learning methods, with the potential to provide more accurate and robust analyses. In this review, the progress in this field and its clinical impact are reviewed, and potential future developments of GAN applications in OCT image processing are discussed.
Collapse
Affiliation(s)
- Jason Kugelman
- Queensland University of Technology (QUT), Contact Lens and Visual Optics Laboratory, Centre for Vision and Eye Research, School of Optometry and Vision Science, Kelvin Grove, QLD 4059, Australia.
| | - David Alonso-Caneiro
- Queensland University of Technology (QUT), Contact Lens and Visual Optics Laboratory, Centre for Vision and Eye Research, School of Optometry and Vision Science, Kelvin Grove, QLD 4059, Australia
| | - Scott A Read
- Queensland University of Technology (QUT), Contact Lens and Visual Optics Laboratory, Centre for Vision and Eye Research, School of Optometry and Vision Science, Kelvin Grove, QLD 4059, Australia
| | - Michael J Collins
- Queensland University of Technology (QUT), Contact Lens and Visual Optics Laboratory, Centre for Vision and Eye Research, School of Optometry and Vision Science, Kelvin Grove, QLD 4059, Australia
| |
Collapse
|
34
|
da Silva MV, Ouellette J, Lacoste B, Comin CH. An analysis of the influence of transfer learning when measuring the tortuosity of blood vessels. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2022; 225:107021. [PMID: 35914440 DOI: 10.1016/j.cmpb.2022.107021] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/29/2022] [Revised: 07/10/2022] [Accepted: 07/11/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND AND OBJECTIVE Convolutional Neural Networks (CNNs) can provide excellent results for the segmentation of blood vessels. One important aspect of CNNs is that they can be trained on large amounts of data and then be made available, for instance, in image processing software. The pre-trained CNNs can then easily be applied in downstream blood vessel characterization tasks, such as the calculation of the length, tortuosity, or caliber of the blood vessels. Yet, it is still unclear whether pre-trained CNNs can provide robust, unbiased results in downstream tasks involving the morphological analysis of blood vessels. Here, we focus on measuring the tortuosity of blood vessels and investigate to what extent CNNs may provide biased tortuosity values even after fine-tuning the network on a new dataset under study. METHODS We develop a procedure for quantifying the influence of CNN pre-training in downstream analyses involving the measurement of morphological properties of blood vessels. Using this methodology, we compare the performance of CNNs that were trained on images containing blood vessels with high tortuosity against CNNs that were trained on blood vessels with low tortuosity and fine-tuned on blood vessels with high tortuosity. The opposite situation is also investigated. RESULTS We show that the tortuosity values obtained by a CNN trained from scratch on a dataset may not agree with those obtained by a fine-tuned network that was pre-trained on a dataset having different tortuosity statistics. In addition, we show that improving the segmentation accuracy does not necessarily lead to better tortuosity estimation. To mitigate these issues, we propose the application of data augmentation techniques even in situations where they do not improve segmentation performance. For instance, we found that the application of elastic transformations was enough to prevent an 8% underestimation of blood vessel tortuosity when applying CNNs to different datasets. CONCLUSIONS The results highlight the importance of developing new methodologies for training CNNs with the specific goal of reducing the error of morphological measurements, as opposed to the traditional approach of using segmentation accuracy as a proxy metric for performance evaluation.
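The downstream measurement in question is typically some form of the arc-to-chord ratio of a vessel centerline; a minimal version is sketched below (the paper's exact tortuosity measure may differ).

```python
# Arc-to-chord tortuosity of an ordered vessel centerline (sketch).
import numpy as np

def tortuosity(centerline):
    """centerline: (N, 2) array of ordered (row, col) points along one open vessel.
    Returns 1.0 for a straight segment; larger values indicate more tortuosity."""
    pts = np.asarray(centerline, dtype=float)
    arc = np.linalg.norm(np.diff(pts, axis=0), axis=1).sum()  # path length
    chord = np.linalg.norm(pts[-1] - pts[0])                  # endpoint distance
    return arc / chord
```

Because this ratio depends directly on the pixel-level shape of the segmented centerline, small systematic biases in the segmentation translate into biased tortuosity values, which is the effect the study quantifies.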
Collapse
Affiliation(s)
- Matheus V da Silva
- Department of Computer Science, Federal University of São Carlos, São Carlos, SP, Brazil
| | - Julie Ouellette
- Department of Cellular and Molecular Medicine, Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada; Neuroscience Program, Ottawa Hospital Research Institute, Ottawa, ON, Canada
| | - Baptiste Lacoste
- Department of Cellular and Molecular Medicine, Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada
| | - Cesar H Comin
- Department of Computer Science, Federal University of São Carlos, São Carlos, SP, Brazil.
| |
Collapse
|
35
|
Sreejith Kumar AJ, Chong RS, Crowston JG, Chua J, Bujor I, Husain R, Vithana EN, Girard MJA, Ting DSW, Cheng CY, Aung T, Popa-Cherecheanu A, Schmetterer L, Wong D. Evaluation of Generative Adversarial Networks for High-Resolution Synthetic Image Generation of Circumpapillary Optical Coherence Tomography Images for Glaucoma. JAMA Ophthalmol 2022; 140:974-981. [PMID: 36048435 PMCID: PMC9437828 DOI: 10.1001/jamaophthalmol.2022.3375] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]
Abstract
Importance Deep learning (DL) networks require large data sets for training, which can be challenging to collect clinically. Generative models could be used to generate large numbers of synthetic optical coherence tomography (OCT) images to train such DL networks for glaucoma detection. Objective To assess whether generative models can synthesize circumpapillary optic nerve head OCT images of normal and glaucomatous eyes and determine the usability of synthetic images for training DL models for glaucoma detection. Design, Setting, and Participants Progressively growing generative adversarial network models were trained to generate circumpapillary OCT scans. Image gradeability and authenticity were evaluated on a clinical set of 100 real and 100 synthetic images by 2 clinical experts. DL networks for glaucoma detection were trained with real or synthetic images and evaluated on independent internal and external test data sets of 140 and 300 real images, respectively. Main Outcomes and Measures Evaluations of the clinical set were compared between the 2 experts. Glaucoma detection performance of the DL networks was assessed using area under the curve (AUC) analysis. Class activation maps provided visualizations of the regions contributing to the respective classifications. Results A total of 990 normal and 862 glaucomatous eyes were analyzed. Evaluations of the clinical set were similar for gradeability (expert 1: 92.0%; expert 2: 93.0%) and authenticity (expert 1: 51.8%; expert 2: 51.3%). The best-performing DL network trained on synthetic images had AUC scores of 0.97 (95% CI, 0.95-0.99) on the internal test data set and 0.90 (95% CI, 0.87-0.93) on the external test data set, compared with AUCs of 0.96 (95% CI, 0.94-0.99) on the internal test data set and 0.84 (95% CI, 0.80-0.87) on the external test data set for the network trained with real images. An increase in the AUC for the synthetic DL network was observed with the use of larger synthetic data set sizes. Class activation maps showed that the regions of the synthetic images contributing to glaucoma detection were generally similar to those of real images. Conclusions and Relevance DL networks trained with synthetic OCT images for glaucoma detection were comparable with networks trained with real images. These results suggest potential use of generative models in the training of DL networks and as a means of data sharing across institutions without patient information confidentiality issues.
Affiliation(s)
- Ashish Jith Sreejith Kumar: Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; SERI-NTU Advanced Ocular Engineering (STANCE), Singapore; Institute for Infocomm Research, A*STAR, Singapore
- Rachel S Chong: Academic Clinical Program, Duke-NUS Medical School, Singapore
- Jonathan G Crowston: Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; Academic Clinical Program, Duke-NUS Medical School, Singapore
- Jacqueline Chua: Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; Academic Clinical Program, Duke-NUS Medical School, Singapore
- Inna Bujor: Carol Davila University of Medicine and Pharmacy, Bucharest, Romania
- Rahat Husain: Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; Academic Clinical Program, Duke-NUS Medical School, Singapore
- Eranga N Vithana: Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; Academic Clinical Program, Duke-NUS Medical School, Singapore
- Michaël J A Girard: Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; Academic Clinical Program, Duke-NUS Medical School, Singapore; Institute of Molecular and Clinical Ophthalmology, Basel, Switzerland
- Daniel S W Ting: Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; Academic Clinical Program, Duke-NUS Medical School, Singapore
- Ching-Yu Cheng: Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; Academic Clinical Program, Duke-NUS Medical School, Singapore; Institute of Molecular and Clinical Ophthalmology, Basel, Switzerland
- Tin Aung: Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; Academic Clinical Program, Duke-NUS Medical School, Singapore; Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Alina Popa-Cherecheanu: Carol Davila University of Medicine and Pharmacy, Bucharest, Romania; Department of Ophthalmology, Emergency University Hospital, Bucharest, Romania
- Leopold Schmetterer: Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; Academic Clinical Program, Duke-NUS Medical School, Singapore; SERI-NTU Advanced Ocular Engineering (STANCE), Singapore; Institute of Molecular and Clinical Ophthalmology, Basel, Switzerland; Department of Ophthalmology and Optometry, Medical University Vienna, Vienna, Austria; School of Chemical and Biomedical Engineering, Nanyang Technological University, Singapore; Department of Clinical Pharmacology, Medical University Vienna, Vienna, Austria; Center for Medical Physics and Biomedical Engineering, Medical University Vienna, Vienna, Austria
- Damon Wong: Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; SERI-NTU Advanced Ocular Engineering (STANCE), Singapore; School of Chemical and Biomedical Engineering, Nanyang Technological University, Singapore
36
Guo X, Lu X, Lin Q, Zhang J, Hu X, Che S. A novel retinal image generation model with the preservation of structural similarity and high resolution. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.104004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
37
Elwin JGR, Mandala J, Maram B, Kumar RR. Ar-HGSO: Autoregressive-Henry Gas Sailfish Optimization enabled deep learning model for diabetic retinopathy detection and severity level classification. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103712] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
38
Qayyum A, Sultani W, Shamshad F, Tufail R, Qadir J. Single-shot retinal image enhancement using untrained and pretrained neural networks priors integrated with analytical image priors. Comput Biol Med 2022; 148:105879. [PMID: 35863248 DOI: 10.1016/j.compbiomed.2022.105879] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/10/2022] [Revised: 06/20/2022] [Accepted: 07/09/2022] [Indexed: 01/08/2023]
Abstract
Retinal images acquired using fundus cameras are often visually blurred due to imperfect imaging conditions, refractive medium turbidity, and motion blur. In addition, ocular diseases such as cataracts also result in blurred retinal images. Blur in retinal fundus images reduces the effectiveness of diagnosis by an expert ophthalmologist or by a computer-aided detection/diagnosis system. In this paper, we put forward a single-shot deep image prior (DIP)-based approach for retinal image enhancement. Unlike typical deep learning-based approaches, our method does not require any training data; instead, the DIP learns the underlying image prior from a single degraded image. We frame retinal image enhancement as a layer decomposition problem and investigate two well-known analytical priors, the dark channel prior (DCP) and the bright channel prior (BCP), for atmospheric light estimation. We show that both untrained and pretrained neural networks can generate an enhanced image from only a single degraded image. The proposed approach is time- and memory-efficient, which makes it feasible for resource-constrained real-world environments. We evaluate the proposed framework quantitatively on five datasets using three widely used metrics and complement this with a subjective qualitative assessment of the enhancement by two expert ophthalmologists. For instance, untrained CDIPs coupled with the DCP achieved average PSNR, SSIM, and BRISQUE values of 40.41, 0.97, and 34.2, respectively, and untrained CDIPs coupled with the BCP achieved 40.22, 0.98, and 36.38, respectively. Our extensive experimental comparison with several competitive baselines on public and non-public proprietary datasets validates the proposed ideas and framework.
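Note: a minimal sketch of the dark channel prior and the atmospheric light estimate it supports is given below; the patch size and the top-0.1% heuristic are common DCP conventions assumed here, not the authors' exact settings.

```python
# Minimal dark-channel-prior (DCP) sketch for atmospheric light estimation.
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(img, patch=15):
    """Per-pixel minimum over RGB, then a local minimum filter."""
    per_pixel_min = img.min(axis=2)
    return minimum_filter(per_pixel_min, size=patch)

def estimate_atmospheric_light(img, patch=15, top_frac=0.001):
    """Average the image pixels at the brightest dark-channel locations.
    (The BCP variant works analogously with maxima instead of minima.)"""
    dc = dark_channel(img, patch)
    n = max(1, int(top_frac * dc.size))
    idx = np.argpartition(dc.ravel(), -n)[-n:]   # brightest dark-channel pixels
    return img.reshape(-1, 3)[idx].mean(axis=0)  # one estimate per RGB channel

img = np.random.rand(256, 256, 3).astype(np.float32)  # stand-in for a fundus image
A = estimate_atmospheric_light(img)
```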
Affiliation(s)
- Adnan Qayyum: Information Technology University of the Punjab, Lahore, Pakistan
- Waqas Sultani: Information Technology University of the Punjab, Lahore, Pakistan
- Fahad Shamshad: Information Technology University of the Punjab, Lahore, Pakistan
39
Narotamo H, Ouarne M, Franco CA, Silveira M. Synthetic Generation of 3D Microscopy Images using Generative Adversarial Networks. Annu Int Conf IEEE Eng Med Biol Soc 2022; 2022:549-552. [PMID: 36086569 DOI: 10.1109/embc48229.2022.9871631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Fluorescence microscopy images of cell organelles enable the study of various complex biological processes. Recently, deep learning (DL) models have been used for accurate automatic analysis of these images, offering state-of-the-art performance in many image analysis tasks such as object classification, segmentation, and detection. However, training a DL model requires a large manually annotated dataset. Manual annotation of 3D microscopy images is time-consuming and must be performed by specialists in the area, so typically only a few annotated images are available. Recent advances in generative adversarial networks (GANs) have allowed the translation of conditioned inputs into realistic-looking synthetic images. In this work we therefore explore GAN-based approaches to create synthetic 3D microscopy images, comparing four approaches that differ in the conditions of the input image. The quality of the generated images was assessed visually and with a quantitative objective GAN evaluation metric. The results showed that the GAN is able to generate synthetic images similar to the real ones. Hence, we have presented a GAN-based method to overcome the issue of small annotated datasets in the biomedical imaging field.
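Note: the abstract does not name its quantitative GAN metric; one common choice is the Fréchet Inception Distance. A minimal sketch of the underlying Fréchet distance between two Gaussian feature distributions follows; the pretrained feature extractor is omitted and the random features are stand-ins.

```python
# Minimal Fréchet-distance sketch (the quantity behind FID).
import numpy as np
from scipy import linalg

def frechet_distance(feats_real, feats_fake):
    mu1, mu2 = feats_real.mean(0), feats_fake.mean(0)
    c1 = np.cov(feats_real, rowvar=False)
    c2 = np.cov(feats_fake, rowvar=False)
    covmean = linalg.sqrtm(c1 @ c2)
    if np.iscomplexobj(covmean):   # numerical noise can yield tiny imaginary parts
        covmean = covmean.real
    diff = mu1 - mu2
    return diff @ diff + np.trace(c1 + c2 - 2.0 * covmean)

real = np.random.randn(500, 64)    # stand-ins for embedded real/synthetic images
fake = np.random.randn(500, 64)
print(frechet_distance(real, fake))
```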
40
Li J, Chen H, Li Y, Peng Y, Sun J, Pan P. Cross-modality synthesis aiding lung tumor segmentation on multi-modal MRI images. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103655] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/24/2022]
41
MinimalGAN: diverse medical image synthesis for data augmentation using minimal training data. Appl Intell 2022. [DOI: 10.1007/s10489-022-03609-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/14/2022]
42
Tubular shape aware data generation for segmentation in medical imaging. Int J Comput Assist Radiol Surg 2022; 17:1091-1099. [DOI: 10.1007/s11548-022-02621-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/01/2021] [Accepted: 03/23/2022] [Indexed: 11/05/2022]
43
Chen Y, Yang XH, Wei Z, Heidari AA, Zheng N, Li Z, Chen H, Hu H, Zhou Q, Guan Q. Generative Adversarial Networks in Medical Image augmentation: A review. Comput Biol Med 2022; 144:105382. [PMID: 35276550 DOI: 10.1016/j.compbiomed.2022.105382] [Citation(s) in RCA: 96] [Impact Index Per Article: 32.0] [Received: 01/30/2022] [Revised: 02/25/2022] [Accepted: 03/02/2022] [Indexed: 12/31/2022]
Abstract
OBJECTIVE With the development of deep learning, the number of training samples needed for medical image-based diagnosis and treatment models keeps growing. Generative Adversarial Networks (GANs) have attracted attention in medical image processing due to their excellent image generation capabilities and have been widely used for data augmentation. In this paper, a comprehensive and systematic review and analysis of medical image augmentation work is carried out, covering its research status and development prospects. METHOD This paper reviews 105 medical image augmentation papers, mainly collected from ELSEVIER, IEEE Xplore, and Springer between 2018 and 2021. We grouped these papers according to the organs shown in the images, and catalogued the medical image datasets that appeared in them, the loss functions used in model training, and the quantitative evaluation metrics for image augmentation. We also briefly introduce the literature collected from three journals and three conferences that receive attention in medical image processing. RESULT First, we summarize the advantages of various augmentation models, loss functions, and evaluation metrics; researchers can use this information as a reference when designing augmentation tasks. Second, we explore the relationship between augmentation models and the size of the training set, and tease out the role augmentation models may play when the quality of the training set is limited. Third, the publication counts show that momentum in this research field remains strong. Furthermore, we discuss the existing limitations of this type of model and suggest possible research directions. CONCLUSION We discuss GAN-based medical image augmentation work in detail. This approach effectively alleviates the challenge of limited training samples for medical image diagnosis and treatment models. It is hoped that this review will benefit researchers interested in this field.
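Note: most of the augmentation models this review covers build on the standard adversarial training objective. A minimal sketch of the non-saturating GAN losses is below; the logits are assumed to come from placeholder networks, not from any specific reviewed architecture.

```python
# Minimal non-saturating GAN loss sketch (PyTorch).
import torch
import torch.nn.functional as F

def d_loss(d_real_logits, d_fake_logits):
    """Discriminator: push real logits toward 1, fake logits toward 0."""
    real = F.binary_cross_entropy_with_logits(d_real_logits, torch.ones_like(d_real_logits))
    fake = F.binary_cross_entropy_with_logits(d_fake_logits, torch.zeros_like(d_fake_logits))
    return real + fake

def g_loss(d_fake_logits):
    """Generator (non-saturating form): make fakes look real to D."""
    return F.binary_cross_entropy_with_logits(d_fake_logits, torch.ones_like(d_fake_logits))
```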
Affiliation(s)
- Yizhou Chen: College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China
- Xu-Hua Yang: College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China
- Zihan Wei: College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China
- Ali Asghar Heidari: School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran, Tehran, Iran; Department of Computer Science, School of Computing, National University of Singapore, Singapore
- Nenggan Zheng: Qiushi Academy for Advanced Studies, Zhejiang University, Hangzhou, Zhejiang, China
- Zhicheng Li: Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Huiling Chen: College of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou, Zhejiang, 325035, China
- Haigen Hu: College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China
- Qianwei Zhou: College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China
- Qiu Guan: College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China
44
Deng X, Ye J. A retinal blood vessel segmentation based on improved D-MNet and pulse-coupled neural network. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2021.103467] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 11/02/2022]
45
Lin E, Lin CH, Lane HY. De Novo Peptide and Protein Design Using Generative Adversarial Networks: An Update. J Chem Inf Model 2022; 62:761-774. [DOI: 10.1021/acs.jcim.1c01361] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/11/2022]
Affiliation(s)
- Eugene Lin: Department of Biostatistics, University of Washington, Seattle, Washington 98195, United States; Department of Electrical & Computer Engineering, University of Washington, Seattle, Washington 98195, United States; Graduate Institute of Biomedical Sciences, China Medical University, Taichung 40402, Taiwan
- Chieh-Hsin Lin: Graduate Institute of Biomedical Sciences, China Medical University, Taichung 40402, Taiwan; Department of Psychiatry, Kaohsiung Chang Gung Memorial Hospital, Chang Gung University College of Medicine, Kaohsiung 83301, Taiwan; School of Medicine, Chang Gung University, Taoyuan 33302, Taiwan
- Hsien-Yuan Lane: Graduate Institute of Biomedical Sciences, China Medical University, Taichung 40402, Taiwan; Department of Psychiatry, China Medical University Hospital, Taichung 40447, Taiwan; Brain Disease Research Center, China Medical University Hospital, Taichung 40447, Taiwan; Department of Psychology, College of Medical and Health Sciences, Asia University, Taichung 41354, Taiwan
46
You A, Kim JK, Ryu IH, Yoo TK. Application of generative adversarial networks (GAN) for ophthalmology image domains: a survey. Eye Vis (Lond) 2022; 9:6. [PMID: 35109930 PMCID: PMC8808986 DOI: 10.1186/s40662-022-00277-3] [Citation(s) in RCA: 57] [Impact Index Per Article: 19.0] [Received: 07/15/2021] [Accepted: 01/11/2022] [Indexed: 12/12/2022]
Abstract
BACKGROUND Recent advances in deep learning techniques have improved diagnostic abilities in ophthalmology. A generative adversarial network (GAN), which consists of two competing deep neural networks, a generator and a discriminator, has demonstrated remarkable performance in image synthesis and image-to-image translation. The adoption of GANs for medical imaging is increasing for image generation and translation, but they are not yet familiar to researchers in the field of ophthalmology. In this work, we present a literature review on the application of GANs in ophthalmology image domains to discuss important contributions and to identify potential future research directions. METHODS We surveyed studies using GANs published before June 2021 and introduce various applications of GANs in ophthalmology image domains. The search identified 48 peer-reviewed papers for the final review. The type of GAN used in each analysis, the task, the imaging domain, and the outcome were collected to verify the usefulness of the GAN. RESULTS In ophthalmology image domains, GANs can perform segmentation, data augmentation, denoising, domain transfer, super-resolution, post-intervention prediction, and feature extraction. GAN techniques have extended the datasets and modalities available in ophthalmology. GANs have several limitations, such as mode collapse, spatial deformities, unintended changes, and the generation of high-frequency noise and checkerboard artifacts. CONCLUSIONS The use of GANs has benefited various tasks in ophthalmology image domains. Based on our observations, the adoption of GANs in ophthalmology is still at a very early stage of clinical validation compared with deep learning classification techniques, because several problems must be overcome for practical use. However, proper selection of the GAN technique and statistical modeling of ocular imaging will greatly improve the performance of each image analysis. Finally, this survey should enable researchers to select the appropriate GAN technique to maximize the potential of ophthalmology datasets for deep learning research.
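Note: one widely cited remedy for the checkerboard artifacts this survey mentions is to replace transposed convolutions with resize-then-convolve upsampling. The sketch below contrasts the two; the channel sizes are illustrative, not drawn from any surveyed model.

```python
# Minimal sketch: transposed convolution vs. resize-then-convolve upsampling.
import torch.nn as nn

# Prone to checkerboard artifacts when the kernel size is not divisible
# by the stride (uneven kernel overlap):
up_transposed = nn.ConvTranspose2d(64, 32, kernel_size=3, stride=2, padding=1)

# Resize-then-convolve alternative, which avoids the uneven overlap:
up_resize_conv = nn.Sequential(
    nn.Upsample(scale_factor=2, mode="nearest"),
    nn.Conv2d(64, 32, kernel_size=3, padding=1),
)
```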
Affiliation(s)
- Aram You: School of Architecture, Kumoh National Institute of Technology, Gumi, Gyeongbuk, South Korea
- Jin Kuk Kim: B&VIIT Eye Center, Seoul, South Korea; VISUWORKS, Seoul, South Korea
- Ik Hee Ryu: B&VIIT Eye Center, Seoul, South Korea; VISUWORKS, Seoul, South Korea
- Tae Keun Yoo: B&VIIT Eye Center, Seoul, South Korea; Department of Ophthalmology, Aerospace Medical Center, Republic of Korea Air Force, 635 Danjae-ro, Namil-myeon, Cheongwon-gun, Cheongju, Chungcheongbuk-do, 363-849, South Korea
47
Review of Machine Learning Applications Using Retinal Fundus Images. Diagnostics (Basel) 2022; 12:134. [PMID: 35054301 PMCID: PMC8774893 DOI: 10.3390/diagnostics12010134] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/07/2021] [Revised: 01/03/2022] [Accepted: 01/03/2022] [Indexed: 02/04/2023]
Abstract
Automating screening and diagnosis in the medical field saves time and reduces the chances of misdiagnosis while saving labor and cost for physicians. With the development of deep learning methods, machines can now interpret complex features in medical data, which has led to rapid advancements in automation. Such efforts have been made in ophthalmology to analyze retinal images and to build frameworks for identifying retinopathy and assessing its severity. This paper reviews recent state-of-the-art work utilizing color fundus images, one of the principal imaging modalities in ophthalmology. Specifically, deep learning methods for automated screening and diagnosis of diabetic retinopathy (DR), age-related macular degeneration (AMD), and glaucoma are investigated. In addition, the machine learning techniques applied to retinal vasculature extraction from fundus images are covered. The challenges in developing these systems are also discussed.
48
Li X, Jiang Y, Rodriguez-Andina JJ, Luo H, Yin S, Kaynak O. When medical images meet generative adversarial network: recent development and research opportunities. Discov Artif Intell 2021; 1:5. [DOI: 10.1007/s44163-021-00006-0] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 06/08/2021] [Accepted: 07/12/2021] [Indexed: 11/27/2022]
Abstract
Deep learning techniques have promoted the rise of artificial intelligence (AI) and performed well in computer vision. Medical image analysis is an important application of deep learning, which is expected to greatly reduce the workload of doctors, contributing to more sustainable health systems. However, most current AI methods for medical image analysis are based on supervised learning, which requires large amounts of annotated data. The number of medical images available is usually small, and acquiring medical image annotations is expensive. The generative adversarial network (GAN), an unsupervised method that has become very popular in recent years, can simulate the distribution of real data and reconstruct approximations of real data. GANs open exciting new ways for medical image generation, expanding the number of medical images available for deep learning methods; generated data can address the problems of insufficient data and imbalanced data categories. Adversarial training is another contribution of GANs to medical imaging and has been applied to many tasks, such as classification, segmentation, and detection. This paper investigates the research status of GANs in medical imaging and analyzes several GAN methods commonly applied in this area, addressing GAN application for both medical image synthesis and adversarial learning for other medical image tasks. Open challenges and future research directions are also discussed.
49
Chen JS, Coyner AS, Chan RP, Hartnett ME, Moshfeghi DM, Owen LA, Kalpathy-Cramer J, Chiang MF, Campbell JP. Deepfakes in Ophthalmology. Ophthalmol Sci 2021; 1:100079. [PMID: 36246951 PMCID: PMC9562356 DOI: 10.1016/j.xops.2021.100079] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 06/09/2021] [Revised: 10/01/2021] [Accepted: 10/29/2021] [Indexed: 02/06/2023]
Abstract
Purpose Generative adversarial networks (GANs) are deep learning (DL) models that can create and modify realistic-appearing synthetic images, or deepfakes, from real images. The purpose of our study was to evaluate the ability of experts to discern synthesized retinal fundus images from real fundus images and to review the current uses and limitations of GANs in ophthalmology. Design Development and expert evaluation of a GAN and an informal review of the literature. Participants A total of 4282 image pairs of fundus images and retinal vessel maps acquired from a multicenter ROP screening program. Methods Pix2Pix HD, a high-resolution GAN, was first trained and validated on fundus and vessel map image pairs and subsequently used to generate 880 images from a held-out test set. Fifty synthetic images from this test set and 50 different real images were presented to 4 expert ROP ophthalmologists using a custom online system for evaluation of whether the images were real or synthetic. Literature was reviewed on PubMed and Google Scholar using combinations of the terms ophthalmology, GANs, generative adversarial networks, images, deepfakes, and synthetic. Ancestor search was performed to broaden results. Main Outcome Measures Expert ability to discern real versus synthetic images was evaluated using percent accuracy. Statistical significance was evaluated using a Fisher exact test, with P values ≤ 0.05 thresholded for significance. Results The expert majority correctly identified 59% of images as being real or synthetic (P = 0.1). Experts 1 to 4 correctly identified 54%, 58%, 49%, and 61% of images (P = 0.505, 0.158, 1.000, and 0.043, respectively). These results suggest that the majority of experts could not discern between real and synthetic images. Additionally, we identified 20 implementations of GANs in the ophthalmology literature, with applications in a variety of imaging modalities and ophthalmic diseases. Conclusions Generative adversarial networks can create synthetic fundus images that are indiscernible from real fundus images by expert ROP ophthalmologists. Synthetic images may improve dataset augmentation for DL, may be used in trainee education, and may have implications for patient privacy.
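Note: a minimal sketch of a Fisher exact test of the kind reported above is given below. The 2x2 table construction (an expert's correct/incorrect calls against a 50/50 chance split) is an illustrative assumption and does not reproduce the study's exact analysis or its reported P values.

```python
# Minimal Fisher exact test sketch for expert discrimination accuracy.
from scipy.stats import fisher_exact

expert = [58, 42]   # hypothetical: 58 correct, 42 incorrect of 100 images
chance = [50, 50]   # expected split if real and synthetic were indistinguishable
odds_ratio, p = fisher_exact([expert, chance])
print(p)            # the study thresholded P values <= 0.05 for significance
```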
Affiliation(s)
- Jimmy S. Chen: Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, Portland, Oregon
- Aaron S. Coyner: Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, Portland, Oregon
- R.V. Paul Chan: Department of Ophthalmology and Visual Sciences, University of Illinois at Chicago, Chicago, Illinois
- M. Elizabeth Hartnett: Department of Ophthalmology, John A. Moran Eye Center, University of Utah, Salt Lake City, Utah
- Darius M. Moshfeghi: Byers Eye Institute, Horngren Family Vitreoretinal Center, Department of Ophthalmology, Stanford University School of Medicine, Palo Alto, California
- Leah A. Owen: Department of Ophthalmology, John A. Moran Eye Center, University of Utah, Salt Lake City, Utah
- Jayashree Kalpathy-Cramer: Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Charlestown, Massachusetts; Massachusetts General Hospital & Brigham and Women’s Hospital Center for Clinical Data Science, Boston, Massachusetts
- Michael F. Chiang: National Eye Institute, National Institutes of Health, Bethesda, Maryland
- J. Peter Campbell: Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, Portland, Oregon
- Correspondence: J. Peter Campbell, MD, MPH, Department of Ophthalmology, Casey Eye Institute, Oregon Health & Science University, 515 SW Campus Drive, Portland, OR 97239.
50
Chen Y, Long J, Guo J. RF-GANs: A Method to Synthesize Retinal Fundus Images Based on Generative Adversarial Network. Comput Intell Neurosci 2021; 2021:3812865. [PMID: 34804140 PMCID: PMC8598326 DOI: 10.1155/2021/3812865] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2021] [Revised: 10/03/2021] [Accepted: 10/23/2021] [Indexed: 11/17/2022]
Abstract
Diabetic retinopathy (DR) is a diabetic complication affecting the eyes and the main cause of blindness in young and middle-aged people. To speed up the diagnosis of DR, many deep learning methods have been applied to detect this disease, but their results have been limited by imbalanced training data, i.e., the scarcity of DR fundus images. To address this data imbalance, this paper proposes a method dubbed retinal fundus images generative adversarial networks (RF-GANs), based on generative adversarial networks, to synthesize retinal fundus images. RF-GANs is composed of two generation models, RF-GAN1 and RF-GAN2. Firstly, RF-GAN1 translates retinal fundus images from a source domain (the domain of semantic segmentation datasets) to a target domain (the domain of the Kaggle EyePACS dataset). We then train semantic segmentation models with the translated images and employ the trained models to extract the structural and lesion masks (hereafter, Masks) of EyePACS. Finally, RF-GAN2 synthesizes retinal fundus images from the Masks and DR grading labels. The paper verifies the effectiveness of the method: RF-GAN1 narrows the domain gap between datasets, improving the performance of the segmentation models, and RF-GAN2 synthesizes realistic retinal fundus images. Adopting the synthesized images for data augmentation, the accuracy and quadratic weighted kappa of a state-of-the-art DR grading model on the EyePACS test set increase by 1.53% and 1.70%, respectively.
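Note: quadratic weighted kappa, the grading metric reported above, is available off the shelf. A minimal sketch follows; the label vectors are dummies standing in for DR grades 0-4, not the study's outputs.

```python
# Minimal quadratic-weighted-kappa sketch for DR grading (labels 0-4).
from sklearn.metrics import cohen_kappa_score

y_true = [0, 1, 2, 3, 4, 2, 1, 0]   # dummy ground-truth DR grades
y_pred = [0, 1, 1, 3, 4, 2, 0, 0]   # dummy model predictions
qwk = cohen_kappa_score(y_true, y_pred, weights="quadratic")
print(round(qwk, 3))
```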
Affiliation(s)
- Yu Chen: Information and Computer Engineering College, Northeast Forestry University, Harbin, China
- Jun Long: Information and Computer Engineering College, Northeast Forestry University, Harbin, China
- Jifeng Guo: Information and Computer Engineering College, Northeast Forestry University, Harbin, China