1. Cap QH, Fukuda A. High-Quality Medical Image Generation from Free-hand Sketch. Annu Int Conf IEEE Eng Med Biol Soc 2024;2024:1-6. PMID: 40039523. DOI: 10.1109/embc53108.2024.10781774.
Abstract
Generating medical images from human-drawn free-hand sketches holds promise for various important medical imaging applications. Because collecting free-hand sketch data in the medical domain is extremely difficult, most deep learning-based methods are trained to generate medical images from synthesized sketches (e.g., edge maps or contours of segmentation masks extracted from real images). However, these models often fail to generalize to free-hand sketches, leading to unsatisfactory results. In this paper, we propose a practical free-hand sketch-to-image generation model called Sketch2MedI that learns to represent sketches in StyleGAN's latent space and to generate medical images from that representation. Thanks to its ability to encode sketches into this meaningful representation space, Sketch2MedI requires only synthesized sketches for training, enabling a cost-effective learning process. Sketch2MedI generalizes robustly to free-hand sketches, producing high-quality and realistic medical images. Comparative evaluations against the pix2pix, CycleGAN, UNIT, and U-GAT-IT models show superior performance in generating pharyngeal images, both quantitatively and qualitatively across various metrics.
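The central mechanism here, encoding a sketch into the latent space of a pretrained generator and decoding a medical image from that code, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the encoder architecture, the generic frozen generator standing in for StyleGAN, the sketch re-extraction function, and the reconstruction losses are all assumptions introduced for the example.

```python
import torch
import torch.nn as nn

class SketchEncoder(nn.Module):
    """Maps a single-channel sketch to a latent code of a frozen, pretrained generator."""
    def __init__(self, latent_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, latent_dim),
        )

    def forward(self, sketch):
        return self.net(sketch)

def training_step(encoder, generator, to_sketch, sketch, target_image):
    """One training step on a synthesized sketch/image pair: encode the sketch,
    decode an image with the frozen generator, and supervise with image
    reconstruction plus sketch re-extraction (hypothetical loss terms)."""
    l1 = nn.L1Loss()
    w = encoder(sketch)            # latent code in the generator's space
    fake = generator(w)            # frozen pretrained generator (placeholder)
    return l1(fake, target_image) + l1(to_sketch(fake), sketch)
```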
2. Kreitner L, Paetzold JC, Rauch N, Chen C, Hagag AM, Fayed AE, Sivaprasad S, Rausch S, Weichsel J, Menze BH, Harders M, Knier B, Rueckert D, Menten MJ. Synthetic Optical Coherence Tomography Angiographs for Detailed Retinal Vessel Segmentation Without Human Annotations. IEEE Trans Med Imaging 2024;43:2061-2073. PMID: 38224512. DOI: 10.1109/tmi.2024.3354408.
Abstract
Optical coherence tomography angiography (OCTA) is a non-invasive imaging modality that can acquire high-resolution volumes of the retinal vasculature and aid the diagnosis of ocular, neurological and cardiac diseases. Segmenting the visible blood vessels is a common first step when extracting quantitative biomarkers from these images. Classical segmentation algorithms based on thresholding are strongly affected by image artifacts and limited signal-to-noise ratio. The use of modern, deep learning-based segmentation methods has been inhibited by a lack of large datasets with detailed annotations of the blood vessels. To address this issue, recent work has employed transfer learning, where a segmentation network is trained on synthetic OCTA images and is then applied to real data. However, the previously proposed simulations fail to faithfully model the retinal vasculature and do not provide effective domain adaptation. Because of this, current methods are unable to fully segment the retinal vasculature, in particular the smallest capillaries. In this work, we present a lightweight simulation of the retinal vascular network based on space colonization for faster and more realistic OCTA synthesis. We then introduce three contrast adaptation pipelines to decrease the domain gap between real and artificial images. We demonstrate the superior segmentation performance of our approach in extensive quantitative and qualitative experiments on three public datasets that compare our method to traditional computer vision algorithms and supervised training using human annotations. Finally, we make our entire pipeline publicly available, including the source code, pretrained models, and a large dataset of synthetic OCTA images.
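Space colonization, which the authors use to grow realistic vascular trees, iteratively extends a branching structure toward a cloud of attraction points and removes points that have been reached. The following 2-D NumPy sketch illustrates only the basic algorithm; the step size, influence and kill radii, and the simplified growth rule are illustrative assumptions rather than the parameters of the published simulation.

```python
import numpy as np

def space_colonization(attractors, root, step=0.01, influence=0.1, kill=0.02, iters=200):
    """Grow a vessel-like tree toward scattered attraction points (2-D sketch)."""
    nodes = [np.asarray(root, float)]
    edges = []
    attractors = [np.asarray(a, float) for a in attractors]
    for _ in range(iters):
        if not attractors:
            break
        pull = {}  # node index -> unit vectors toward the attractors it is closest to
        for a in attractors:
            d = [np.linalg.norm(a - n) for n in nodes]
            i = int(np.argmin(d))
            if d[i] < influence:
                pull.setdefault(i, []).append((a - nodes[i]) / (d[i] + 1e-9))
        if not pull:
            break
        for i, dirs in pull.items():
            v = np.mean(dirs, axis=0)                    # average pull direction
            new = nodes[i] + step * v / (np.linalg.norm(v) + 1e-9)
            edges.append((i, len(nodes)))                # connect parent to new node
            nodes.append(new)
        # remove attraction points that a branch has reached
        attractors = [a for a in attractors
                      if min(np.linalg.norm(a - n) for n in nodes) > kill]
    return np.array(nodes), edges

# usage: grow a small tree in the unit square
rng = np.random.default_rng(0)
nodes, edges = space_colonization(rng.random((300, 2)), root=[0.5, 0.0])
```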
3. Jeong JJ, Tariq A, Adejumo T, Trivedi H, Gichoya JW, Banerjee I. Systematic Review of Generative Adversarial Networks (GANs) for Medical Image Classification and Segmentation. J Digit Imaging 2022;35:137-152. PMID: 35022924. PMCID: PMC8921387. DOI: 10.1007/s10278-021-00556-w.
Abstract
In recent years, generative adversarial networks (GANs) have gained tremendous popularity for various imaging-related tasks such as artificial image generation to support AI training. GANs are especially useful for medical imaging tasks, where training datasets are usually limited in size and heavily imbalanced against the diseased class. We present a systematic review, following the PRISMA guidelines, of recent GAN architectures used for medical image analysis, to help readers make an informed decision before employing GANs in developing medical image classification and segmentation models. We extracted 54 papers, published between January 2015 and August 2020, that highlight the capabilities and applications of GANs in medical imaging and meet the inclusion criteria for meta-analysis. Our results show four main GAN architectures used for segmentation or classification in medical imaging. We provide a comprehensive overview of recent trends in the application of GANs to clinical diagnosis through medical image segmentation and classification, and we share practical experience with task-based GAN implementations.
Affiliation(s)
- Jiwoong J Jeong: Department of Biomedical Informatics, Emory School of Medicine, Atlanta, USA
- Amara Tariq: Department of Biomedical Informatics, Emory School of Medicine, Atlanta, USA
- Hari Trivedi: Department of Radiology, Emory School of Medicine, Atlanta, USA
- Judy W Gichoya: Department of Radiology, Emory School of Medicine, Atlanta, USA
- Imon Banerjee: Department of Biomedical Informatics, Emory School of Medicine, Atlanta, USA; Department of Radiology, Emory School of Medicine, Atlanta, USA
4. Tomar D, Lortkipanidze M, Vray G, Bozorgtabar B, Thiran JP. Self-Attentive Spatial Adaptive Normalization for Cross-Modality Domain Adaptation. IEEE Trans Med Imaging 2021;40:2926-2938. PMID: 33577450. DOI: 10.1109/tmi.2021.3059265.
Abstract
Despite the successes of deep neural networks on many challenging vision tasks, they often fail to generalize to new test domains that are not distributed identically to the training data. Domain adaptation becomes even more challenging for cross-modality medical data with a notable domain shift, given that specific annotated imaging modalities may be inaccessible or incomplete. Our proposed solution is based on cross-modality synthesis of medical images to reduce the costly annotation burden on radiologists and to bridge the domain gap in radiological images. We present a novel approach for image-to-image translation in medical images, capable of supervised or unsupervised (unpaired image data) setups. Built upon adversarial training, we propose a learnable self-attentive spatial normalization of the deep convolutional generator network's intermediate activations. Unlike previous attention-based image-to-image translation approaches, which are either domain-specific or distort the source domain's structures, we exploit auxiliary semantic information to handle geometric changes and preserve anatomical structures during image translation. We achieve superior results for cross-modality segmentation between unpaired MRI and CT data on multi-modality whole heart and multi-modal brain tumor MRI (T1/T2) datasets compared to state-of-the-art methods. We also observe encouraging results in cross-modality conversion for paired MRI and CT images on a brain dataset. Furthermore, a detailed analysis of the cross-modality image translation and thorough ablation studies confirm the efficacy of our proposed method.
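The normalization idea described here, modulating the generator's intermediate activations with parameters predicted from auxiliary semantic information, can be illustrated with a SPADE-style layer. This is a simplified sketch under stated assumptions: the self-attention branch of the paper is omitted, and the layer sizes and choice of instance normalization are placeholders rather than the authors' exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialAdaptiveNorm(nn.Module):
    """Normalize activations, then modulate them with per-pixel scale and bias
    predicted from an auxiliary semantic map."""
    def __init__(self, channels, semantic_channels, hidden=64):
        super().__init__()
        self.norm = nn.InstanceNorm2d(channels, affine=False)
        self.shared = nn.Sequential(
            nn.Conv2d(semantic_channels, hidden, 3, padding=1), nn.ReLU())
        self.gamma = nn.Conv2d(hidden, channels, 3, padding=1)
        self.beta = nn.Conv2d(hidden, channels, 3, padding=1)

    def forward(self, x, semantic_map):
        # resize the semantic map to the activation resolution, then predict scale/bias
        s = F.interpolate(semantic_map, size=x.shape[2:], mode='nearest')
        s = self.shared(s)
        return self.norm(x) * (1 + self.gamma(s)) + self.beta(s)

# usage
x = torch.randn(2, 128, 32, 32)          # generator activations
seg = torch.randn(2, 4, 256, 256)        # auxiliary semantic information
y = SpatialAdaptiveNorm(128, 4)(x, seg)
```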
5. Chlap P, Min H, Vandenberg N, Dowling J, Holloway L, Haworth A. A review of medical image data augmentation techniques for deep learning applications. J Med Imaging Radiat Oncol 2021;65:545-563. PMID: 34145766. DOI: 10.1111/1754-9485.13261.
Abstract
Research in artificial intelligence for radiology and radiotherapy has recently become increasingly reliant on deep learning-based algorithms. While the models these algorithms produce can significantly outperform more traditional machine learning methods, they rely on larger datasets being available for training. To address this issue, data augmentation has become a popular method for increasing the size of a training dataset, particularly in fields where large datasets are not typically available, which is often the case when working with medical images. Data augmentation aims to generate additional data with which to train the model and has been shown to improve performance when the model is validated on a separate unseen dataset. Because this approach has become commonplace, we conducted a systematic review of the literature in which data augmentation was used on medical images (limited to CT and MRI) to train a deep learning model, to help readers understand the types of data augmentation techniques used in state-of-the-art deep learning models. Articles were categorised into basic, deformable, deep learning or other data augmentation techniques. As artificial intelligence models trained using augmented data make their way into the clinic, this review aims to give insight into these techniques and confidence in the validity of the models produced.
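As an illustration of the "basic" category of augmentation covered by the review, the following sketch applies a random flip, a small rotation, and intensity noise to a 2-D slice and its segmentation mask. The transformation ranges are illustrative defaults, not values recommended by the review.

```python
import numpy as np
from scipy.ndimage import rotate

def augment_slice(image, mask, rng):
    """Apply basic augmentations (flip, rotation, intensity noise) to a 2-D image
    and its segmentation mask; ranges are illustrative assumptions."""
    if rng.random() < 0.5:                                # random horizontal flip
        image, mask = np.fliplr(image), np.fliplr(mask)
    angle = rng.uniform(-10, 10)                          # small random rotation
    image = rotate(image, angle, reshape=False, order=1)  # bilinear for intensities
    mask = rotate(mask, angle, reshape=False, order=0)    # nearest for labels
    image = image + rng.normal(0, 0.01, image.shape)      # additive Gaussian noise
    return image, mask

rng = np.random.default_rng(42)
img_aug, msk_aug = augment_slice(np.zeros((256, 256)), np.zeros((256, 256)), rng)
```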
Affiliation(s)
- Phillip Chlap: South Western Sydney Clinical School, University of New South Wales, Sydney, New South Wales, Australia; Ingham Institute for Applied Medical Research, Sydney, New South Wales, Australia; Liverpool and Macarthur Cancer Therapy Centre, Liverpool Hospital, Sydney, New South Wales, Australia
- Hang Min: South Western Sydney Clinical School, University of New South Wales, Sydney, New South Wales, Australia; Ingham Institute for Applied Medical Research, Sydney, New South Wales, Australia; The Australian e-Health and Research Centre, CSIRO Health and Biosecurity, Brisbane, Queensland, Australia
- Nym Vandenberg: Institute of Medical Physics, University of Sydney, Sydney, New South Wales, Australia
- Jason Dowling: South Western Sydney Clinical School, University of New South Wales, Sydney, New South Wales, Australia; The Australian e-Health and Research Centre, CSIRO Health and Biosecurity, Brisbane, Queensland, Australia
- Lois Holloway: South Western Sydney Clinical School, University of New South Wales, Sydney, New South Wales, Australia; Ingham Institute for Applied Medical Research, Sydney, New South Wales, Australia; Liverpool and Macarthur Cancer Therapy Centre, Liverpool Hospital, Sydney, New South Wales, Australia; Institute of Medical Physics, University of Sydney, Sydney, New South Wales, Australia; Centre for Medical Radiation Physics, University of Wollongong, Wollongong, New South Wales, Australia
- Annette Haworth: Institute of Medical Physics, University of Sydney, Sydney, New South Wales, Australia
6. Peng B, Liu B, Bin Y, Shen L, Lei J. Multi-Modality MR Image Synthesis via Confidence-Guided Aggregation and Cross-Modality Refinement. IEEE J Biomed Health Inform 2021;26:27-35. PMID: 34018939. DOI: 10.1109/jbhi.2021.3082541.
Abstract
Magnetic resonance imaging (MRI) can provide multi-modality MR images by setting task-specific scan parameters and has been widely used in disease diagnosis and treatment planning. However, in practical clinical applications, it is often difficult to obtain multi-modality MR images simultaneously due to patient discomfort and scanning costs. Therefore, how to effectively utilize existing modality images to synthesize missing modality images has become an active research topic. In this paper, we propose a novel confidence-guided aggregation and cross-modality refinement network (CACR-Net) for multi-modality MR image synthesis, which effectively utilizes complementary and correlative information from multiple modalities to synthesize high-quality target-modality images. Specifically, to exploit complementary modality-specific characteristics, a confidence-guided aggregation module adaptively aggregates the multiple target-modality images generated from multiple source-modality images using the corresponding confidence maps. Based on the aggregated target-modality image, a cross-modality refinement module further refines the target-modality image by mining correlative information among the multiple source-modality images and the aggregated target-modality image. By training the proposed CACR-Net in an end-to-end manner, high-quality and sharp target-modality MR images are effectively synthesized. Experimental results on a widely used benchmark demonstrate that the proposed method outperforms state-of-the-art methods.
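One plausible reading of the confidence-guided aggregation module is a per-pixel weighted average of the candidate target-modality images, with weights obtained by normalizing the confidence maps across candidates. The sketch below illustrates that reading; the softmax weighting and tensor shapes are assumptions, not the paper's exact formulation.

```python
import torch

def confidence_guided_aggregation(candidates, confidences):
    """Fuse N candidate target-modality images using per-pixel confidence maps.
    candidates:  (N, B, C, H, W), one synthesis per source modality
    confidences: (N, B, 1, H, W), unnormalized confidence logits"""
    weights = torch.softmax(confidences, dim=0)   # normalize across candidates, per pixel
    return (weights * candidates).sum(dim=0)      # per-pixel weighted average

# usage with three source modalities
cands = torch.randn(3, 2, 1, 64, 64)
confs = torch.randn(3, 2, 1, 64, 64)
fused = confidence_guided_aggregation(cands, confs)   # shape (2, 1, 64, 64)
```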
7. Liu J, Li J, Liu T, Tam J. Graded Image Generation Using Stratified CycleGAN. Med Image Comput Comput Assist Interv 2020;12262:760-769. PMID: 33145588. PMCID: PMC7605896. DOI: 10.1007/978-3-030-59713-9_73.
Abstract
In medical imaging, CycleGAN has been used for various image generation tasks, including image synthesis, image denoising, and data augmentation. However, when pushing the technical limits of medical imaging, there can be substantial variation in image quality. Here, we demonstrate that images generated by CycleGAN can be improved through explicit grading of image quality, an approach we call stratified CycleGAN. In this image generation task, CycleGAN is used to upgrade the image quality and content of near-infrared fluorescent (NIRF) retinal images. After manual assignment of grading scores to a small subset of the data, semi-supervised learning is applied to propagate grades across the remainder of the data and set up the training data. These scores are embedded into the CycleGAN by adding the grading score as a conditional input to the generator and by integrating an image quality classifier into the discriminator. We validate the efficacy of the proposed stratified CycleGAN on pairs of NIRF images of the same retinal regions (imaged with and without correction of optical aberrations achieved using adaptive optics), with the goal of restoring image quality in aberrated images such that cellular-level detail can be recovered. Overall, stratified CycleGAN generated higher quality synthetic images than traditional CycleGAN. Evaluation of cell detection accuracy confirmed that synthetic images were faithful to ground truth images of the same cells. Across this challenging dataset, the F1-score improved from 76.9 ± 5.7% with traditional CycleGAN to 85.0 ± 3.4% with stratified CycleGAN. These findings demonstrate the potential of stratified CycleGAN to improve the synthesis of medical images that exhibit a graded variation in image quality.
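The conditioning mechanism, feeding the grading score to the generator as an additional input, can be sketched by broadcasting the score to a constant channel and concatenating it with the input image. The toy architecture below is a placeholder introduced for illustration; only the conditioning pattern reflects the described approach.

```python
import torch
import torch.nn as nn

class GradeConditionedGenerator(nn.Module):
    """Toy generator that receives the image quality grade as an extra,
    spatially broadcast input channel (architecture is a placeholder)."""
    def __init__(self, in_channels=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels + 1, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, in_channels, 3, padding=1), nn.Tanh())

    def forward(self, x, grade):
        # grade: (B,) scalar score per image, broadcast to a constant feature map
        g = grade.view(-1, 1, 1, 1).expand(-1, 1, *x.shape[2:])
        return self.net(torch.cat([x, g], dim=1))

gen = GradeConditionedGenerator()
out = gen(torch.randn(4, 1, 128, 128), torch.tensor([0.2, 0.5, 0.8, 1.0]))
```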
8. Zhou T, Fu H, Chen G, Shen J, Shao L. Hi-Net: Hybrid-Fusion Network for Multi-Modal MR Image Synthesis. IEEE Trans Med Imaging 2020;39:2772-2781. PMID: 32086202. DOI: 10.1109/tmi.2020.2975344.
Abstract
Magnetic resonance imaging (MRI) is a widely used neuroimaging technique that can provide images of different contrasts (i.e., modalities). Fusing this multi-modal data has proven particularly effective for boosting model performance in many tasks. However, due to poor data quality and frequent patient dropout, collecting all modalities for every patient remains a challenge. Medical image synthesis has been proposed as an effective solution, in which any missing modalities are synthesized from the existing ones. In this paper, we propose a novel Hybrid-fusion Network (Hi-Net) for multi-modal MR image synthesis, which learns a mapping from multi-modal source images (i.e., existing modalities) to target images (i.e., missing modalities). In our Hi-Net, a modality-specific network learns representations for each individual modality, and a fusion network learns the common latent representation of the multi-modal data. A multi-modal synthesis network is then designed to densely combine the latent representation with hierarchical features from each modality, acting as a generator to synthesize the target images. Moreover, a layer-wise multi-modal fusion strategy effectively exploits the correlations among multiple modalities, where a Mixed Fusion Block (MFB) is proposed to adaptively weight different fusion strategies. Extensive experiments demonstrate that the proposed model outperforms other state-of-the-art medical image synthesis methods.
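The idea of adaptively weighting different fusion strategies can be illustrated with a small module that combines two modality-specific feature maps via several fusion operators and mixes them with learned, softmax-normalized coefficients. The choice of operators (sum, product, maximum) and the scalar weights are assumptions for this sketch; the actual Mixed Fusion Block may differ in detail.

```python
import torch
import torch.nn as nn

class MixedFusionBlock(nn.Module):
    """Fuse two modality-specific feature maps with several operators and
    adaptively weight them with learned, softmax-normalized coefficients."""
    def __init__(self):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(3))   # one weight per fusion operator

    def forward(self, f1, f2):
        w = torch.softmax(self.logits, dim=0)
        fused = [f1 + f2, f1 * f2, torch.maximum(f1, f2)]
        return sum(wi * fi for wi, fi in zip(w, fused))

mfb = MixedFusionBlock()
y = mfb(torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32))
```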
9. Luo Y, Chen K, Liu L, Liu J, Mao J, Ke G, Sun M. Dehaze of Cataractous Retinal Images Using an Unpaired Generative Adversarial Network. IEEE J Biomed Health Inform 2020;24:3374-3383. PMID: 32750919. DOI: 10.1109/jbhi.2020.2999077.
Abstract
Cataracts are the leading cause of visual impairment worldwide. Examination of the retina through cataracts using a fundus camera is challenging and error-prone due to degraded image quality. We sought to develop an algorithm to dehaze such images to support diagnosis by either ophthalmologists or computer-aided diagnosis systems. Based on the generative adversarial network (GAN) concept, we designed two neural networks: CataractSimGAN and CataractDehazeNet. CataractSimGAN synthesizes cataract-like images from unpaired clear retinal images and cataract images. CataractDehazeNet was trained on pairs of synthesized cataract-like images and the corresponding clear images through supervised learning. With the two networks trained independently, the number of hyper-parameters was reduced, leading to better performance. We collected 400 retinal images without cataracts and 400 hazy images from cataract patients as the training dataset. Fifty cataract images and the corresponding clear images from the same patients after surgery comprised the test dataset; the clear post-surgery images served as references for evaluating the performance of our method. CataractDehazeNet was able to substantially enhance the degraded images from cataract patients and to visualize blood vessels and the optic disc, while actively suppressing the artifacts common in applications of similar methods. Thus, we developed an algorithm that improves the quality of retinal images acquired from cataract patients and achieves high structural similarity and fidelity between the processed images and images from the same patients after cataract surgery.
10. Zunair H, Ben Hamza A. Melanoma detection using adversarial training and deep transfer learning. Phys Med Biol 2020;65:135005. PMID: 32252036. DOI: 10.1088/1361-6560/ab86d3.
Abstract
Skin lesion datasets consist predominantly of normal samples, with only a small percentage of abnormal ones, giving rise to the class imbalance problem. In addition, skin lesion images are largely similar in overall appearance owing to low inter-class variability. In this paper, we propose a two-stage framework for automatic classification of skin lesion images using adversarial training and transfer learning toward melanoma detection. In the first stage, we leverage the inter-class variation of the data distribution for the task of conditional image synthesis by learning the inter-class mapping and synthesizing under-represented class samples from the over-represented ones using unpaired image-to-image translation. In the second stage, we train a deep convolutional neural network for skin lesion classification using the original training set combined with the newly synthesized under-represented class samples. The classifier is trained by minimizing the focal loss function, which helps the model learn from hard examples while down-weighting the easy ones. Experiments conducted on a dermatology image benchmark demonstrate the superiority of our proposed approach over several standard baseline methods, achieving significant performance improvements. Interestingly, we show through feature visualization and analysis that our method leads to context-based lesion assessment that can reach the level of an expert dermatologist.
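The focal loss referred to here scales the cross-entropy of each example by (1 - p_t)^gamma, so well-classified (easy) examples contribute little and training concentrates on hard ones. A minimal binary version is sketched below; gamma = 2 and alpha = 0.25 are the commonly used defaults from the original focal loss paper, not necessarily the values used in this work.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Binary focal loss: down-weight easy, well-classified examples so training
    focuses on hard ones."""
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction='none')
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)            # probability of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()

loss = focal_loss(torch.randn(8), torch.randint(0, 2, (8,)).float())
```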
11. Mou L, Chen L, Cheng J, Gu Z, Zhao Y, Liu J. Dense Dilated Network With Probability Regularized Walk for Vessel Detection. IEEE Trans Med Imaging 2020;39:1392-1403. PMID: 31675323. DOI: 10.1109/tmi.2019.2950051.
Abstract
The detection of retinal vessels is of great importance in the diagnosis and treatment of many ocular diseases. Many methods have been proposed for vessel detection; however, most of these algorithms neglect the connectivity of the vessels, which plays an important role in diagnosis. In this paper, we propose a novel method for retinal vessel detection. The proposed method includes a dense dilated network to obtain an initial detection of the vessels and a probability regularized walk algorithm to address fractures in the initial detection. The dense dilated network integrates newly proposed dense dilated feature extraction blocks into an encoder-decoder structure to extract and accumulate features at different scales. A multi-scale Dice loss function is adopted to train the network. To improve the connectivity of the segmented vessels, we also introduce a probability regularized walk algorithm that connects broken vessels. The proposed method has been applied to three public datasets: DRIVE, STARE, and CHASE_DB1. The results show that it outperforms state-of-the-art methods in accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve.
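A multi-scale Dice loss can be sketched by evaluating the soft Dice loss on the prediction and ground truth at several resolutions and averaging the results. The scale factors and equal weighting below are assumptions; the paper's multi-scale formulation may differ.

```python
import torch
import torch.nn.functional as F

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss for a probability map and a binary vessel mask."""
    inter = (pred * target).sum()
    return 1 - (2 * inter + eps) / (pred.sum() + target.sum() + eps)

def multi_scale_dice_loss(pred, target, scales=(1.0, 0.5, 0.25)):
    """Average the Dice loss over several resolutions (scales are illustrative)."""
    losses = []
    for s in scales:
        if s == 1.0:
            p, t = pred, target
        else:
            p = F.interpolate(pred, scale_factor=s, mode='bilinear', align_corners=False)
            t = F.interpolate(target, scale_factor=s, mode='nearest')
        losses.append(dice_loss(p, t))
    return torch.stack(losses).mean()

loss = multi_scale_dice_loss(torch.rand(1, 1, 256, 256),
                             torch.randint(0, 2, (1, 1, 256, 256)).float())
```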