1
Zoghby MM, Erickson BJ, Conte GM. Generative Adversarial Networks for Brain MRI Synthesis: Impact of Training Set Size on Clinical Application. J Imaging Inform Med 2024:10.1007/s10278-024-00976-4. [PMID: 38366293] [DOI: 10.1007/s10278-024-00976-4]
Abstract
We evaluated the impact of training set size on generative adversarial networks (GANs) for synthesizing brain MRI sequences. We compared three sets of GANs trained to generate pre-contrast T1 (gT1) from post-contrast T1 and FLAIR (gFLAIR) from T2. The baseline models were trained on 135 cases; for this study, we used the same model architecture but a larger cohort of 1251 cases and two stopping rules, an early checkpoint (early models) and one after 50 epochs (late models). We tested all models on an independent dataset of 485 newly diagnosed gliomas. We compared the generated MRIs with the original ones using the structural similarity index (SSI) and mean squared error (MSE). We simulated scenarios where the original T1, FLAIR, or both were missing and used their synthesized versions, together with the original post-contrast T1 and T2, as inputs for a segmentation model. We compared the segmentations using the Dice similarity coefficient (DSC) for the contrast-enhancing area, the non-enhancing area, and the whole lesion. On the test set, for the gT1, median SSI was .957, .918, and .947 for the baseline, early, and late models, respectively; median MSE was .006, .014, and .008. For the gFLAIR, median SSI was .924, .908, and .915; median MSE was .016, .016, and .019. DSC ranged over .625-.955, .420-.952, and .610-.954, respectively. Overall, GANs trained on a relatively small cohort performed similarly to those trained on a cohort ten times larger, making them a viable option for rare diseases or institutions with limited resources.
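To make the evaluation above concrete, here is a minimal Python sketch of the three metrics used (SSI, MSE, and DSC), assuming two co-registered single-channel slices scaled to [0, 1] and binary lesion masks; the data below are synthetic placeholders, not the study's images.

```python
import numpy as np
from skimage.metrics import structural_similarity, mean_squared_error

def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks."""
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

# Hypothetical inputs: an original and a "synthesized" slice in [0, 1],
# plus binary lesion masks obtained by segmenting each.
rng = np.random.default_rng(0)
original = rng.random((240, 240))
generated = np.clip(original + 0.05 * rng.standard_normal((240, 240)), 0, 1)
mask_orig = original > 0.8
mask_gen = generated > 0.8

ssi = structural_similarity(original, generated, data_range=1.0)
mse = mean_squared_error(original, generated)
dsc = dice(mask_orig, mask_gen)
print(f"SSI={ssi:.3f}  MSE={mse:.3f}  DSC={dsc:.3f}")
```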
Affiliation(s)
- M M Zoghby
- Department of Radiology, Mayo Clinic, Rochester, MN, USA
- B J Erickson
- Department of Radiology, Mayo Clinic, Rochester, MN, USA
- G M Conte
- Department of Radiology, Mayo Clinic, Rochester, MN, USA.
2
Shen K, Vivone G, Yang X, Lolli S, Schmitt M. A benchmarking protocol for SAR colorization: From regression to deep learning approaches. Neural Netw 2024; 169:698-712. [PMID: 37976594] [DOI: 10.1016/j.neunet.2023.10.058]
Abstract
Synthetic aperture radar (SAR) images are widely used in remote sensing. Interpreting SAR images can be challenging due to their intrinsic speckle noise and grayscale nature. To address this issue, SAR colorization has emerged as a research direction aiming to colorize grayscale SAR images while preserving the original spatial and radiometric information. However, this research field is still in its early stages, and many limitations remain. In this paper, we propose a full research line for supervised learning-based approaches to SAR colorization. Our approach includes a protocol for generating synthetic color SAR images, several baselines, and an effective method based on the conditional generative adversarial network (cGAN) for SAR colorization. We also propose numerical assessment metrics for the problem at hand. To our knowledge, this is the first attempt to propose a research line for SAR colorization that includes a protocol, a benchmark, and a complete performance evaluation. Our extensive tests demonstrate the effectiveness of the proposed cGAN-based network for SAR colorization. The code is available at https://github.com/shenkqtx/SAR-Colorization-Benchmarking-Protocol.
Affiliation(s)
- Kangqing Shen
- School of Mathematical Sciences, Beihang University, Beijing, 102206, China
- Gemine Vivone
- Institute of Methodologies for Environmental Analysis, CNR-IMAA, Tito Scalo, 85050, Italy; National Biodiversity Future Center, NBFC, Palermo, 90133, Italy
- Xiaoyuan Yang
- School of Mathematical Sciences, Beihang University, Beijing, 102206, China; Key Laboratory of Mathematics, Information and Behavior, Ministry of Education, Beihang University, Beijing, 102206, China.
- Simone Lolli
- Institute of Methodologies for Environmental Analysis, CNR-IMAA, Tito Scalo, 85050, Italy; CommSensLab, Department of Signal Theory and Communications, Polytechnic University of Catalonia, Barcelona, 08034, Spain
3
Sun Y, Gu Y, Shi F, Liu J, Li G, Feng Q, Shen D. Coarse-to-fine registration and time-intensity curves constraint for liver DCE-MRI synthesis. Comput Med Imaging Graph 2024; 111:102319. [PMID: 38147798] [DOI: 10.1016/j.compmedimag.2023.102319]
Abstract
Image registration plays a crucial role in dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI), serving as a fundamental step for the subsequent diagnosis of benign and malignant tumors. However, the registration process encounters significant challenges due to the substantial intensity changes observed across time points, resulting from the injection of contrast agents. Furthermore, previous studies have often overlooked the alignment of small structures, such as tumors and vessels. In this work, we propose a novel DCE-MRI registration framework that can effectively align the DCE-MRI time series. Specifically, our framework consists of two steps, i.e., a de-enhancement synthesis step and a coarse-to-fine registration step. In the de-enhancement synthesis step, a disentanglement network separates DCE-MRI images into a content component representing the anatomical structures and a style component indicating the presence or absence of contrast agents. This step generates synthetic images in which the contrast agents are removed from the original images, alleviating the negative effects of intensity changes on the subsequent registration. In the registration step, we utilize a coarse registration network followed by a refined registration network. These two networks estimate the coarse and refined displacement vector fields (DVFs) in a pairwise and groupwise registration manner, respectively. In addition, to enhance the alignment accuracy for small structures, a voxel-wise constraint is imposed by assessing the smoothness of the time-intensity curves (TICs). Experimental results on liver DCE-MRI demonstrate that our proposed method outperforms state-of-the-art approaches, offering more robust and accurate alignment results.
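The time-intensity-curve (TIC) smoothness constraint admits a short sketch. This is our generic reading (a mean squared second temporal difference per voxel), not the authors' implementation; the series below is a synthetic stand-in for a registered DCE sequence.

```python
import numpy as np

def tic_smoothness_penalty(series: np.ndarray) -> float:
    """Mean squared second temporal difference of voxel-wise
    time-intensity curves; `series` has shape (T, Z, Y, X).
    A well-aligned DCE series should yield smooth TICs."""
    d2 = series[2:] - 2.0 * series[1:-1] + series[:-2]  # second difference along time
    return float(np.mean(d2 ** 2))

# Hypothetical registered DCE-MRI series: 6 time points of a 4x8x8 volume
# following a saturating-uptake curve plus noise.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 6)[:, None, None, None]
series = 1.0 - np.exp(-3.0 * t) + 0.01 * rng.standard_normal((6, 4, 8, 8))
print(f"TIC smoothness penalty: {tic_smoothness_penalty(series):.5f}")
```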
Affiliation(s)
- Yuhang Sun
- School of Biomedical Engineering, Southern Medical University, Guangzhou, China; School of Biomedical Engineering & State Key Laboratory of Advanced Medical Materials and Devices, ShanghaiTech University, Shanghai, China
- Yuning Gu
- School of Biomedical Engineering & State Key Laboratory of Advanced Medical Materials and Devices, ShanghaiTech University, Shanghai, China
- Feng Shi
- Department of Research and Development, Shanghai United Imaging Intelligence Co., Ltd., Shanghai, China
- Jiameng Liu
- School of Biomedical Engineering & State Key Laboratory of Advanced Medical Materials and Devices, ShanghaiTech University, Shanghai, China
- Guoqiang Li
- School of Biomedical Engineering & State Key Laboratory of Advanced Medical Materials and Devices, ShanghaiTech University, Shanghai, China
- Qianjin Feng
- School of Biomedical Engineering, Southern Medical University, Guangzhou, China.
- Dinggang Shen
- School of Biomedical Engineering & State Key Laboratory of Advanced Medical Materials and Devices, ShanghaiTech University, Shanghai, China; Department of Research and Development, Shanghai United Imaging Intelligence Co., Ltd., Shanghai, China; Shanghai Clinical Research and Trial Center, Shanghai, China.
4
Shi D, Zhang W, He S, Chen Y, Song F, Liu S, Wang R, Zheng Y, He M. Translation of Color Fundus Photography into Fluorescein Angiography Using Deep Learning for Enhanced Diabetic Retinopathy Screening. Ophthalmol Sci 2023; 3:100401. [PMID: 38025160] [PMCID: PMC10630672] [DOI: 10.1016/j.xops.2023.100401]
Abstract
Purpose: To develop and validate a deep learning model that can transform color fundus (CF) photography into corresponding venous and late-phase fundus fluorescein angiography (FFA) images.
Design: Cross-sectional study.
Participants: We included 51,370 CF-venous FFA pairs and 14,644 CF-late FFA pairs from 4438 patients for model development. External testing involved 50 eyes with CF-FFA pairs and 2 public datasets for diabetic retinopathy (DR) classification, with 86,952 CF from EyePACs and 1744 CF from MESSIDOR2.
Methods: We trained a deep learning model to transform CF into corresponding venous and late-phase FFA images. The quality of the translated FFA images was evaluated quantitatively on the internal test set and subjectively on 100 eyes with CF-FFA paired images (50 from the external set), based on the realism of the global image, anatomical landmarks (macula, optic disc, and vessels), and lesions. Moreover, we validated the clinical utility of the translated FFA for classifying 5-class DR and diabetic macular edema (DME) in the EyePACs and MESSIDOR2 datasets.
Main Outcome Measures: Image generation was quantitatively assessed by structural similarity measures (SSIM), and subjectively by 2 clinical experts on a 5-point scale (1 refers to real FFA); intragrader agreement was assessed by kappa. The DR classification accuracy was assessed by area under the receiver operating characteristic curve (AUC).
Results: The SSIM of the translated FFA images was > 0.6, and the subjective quality scores ranged from 1.37 to 2.60. Both experts reported similar quality scores with substantial agreement (all kappas > 0.8). Adding the generated FFA on top of CF improved DR classification: the AUC increased from 0.912 to 0.939 on the EyePACs dataset and from 0.952 to 0.972 on the MESSIDOR2 dataset. The DME AUC also increased from 0.927 to 0.974 on the MESSIDOR2 dataset.
Conclusions: Our CF-to-FFA framework produced realistic FFA images, and adding the translated FFA images on top of CF improved the accuracy of DR screening. These results suggest that CF-to-FFA translation could be used as a surrogate when FFA examination is not feasible, and as a simple add-on to improve DR screening.
Financial Disclosures: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
Affiliation(s)
- Danli Shi
- School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong
- Research Centre for SHARP Vision (RCSV), The Hong Kong Polytechnic University, Kowloon, Hong Kong
- Weiyi Zhang
- School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong
- Shuang He
- State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Guangdong Provincial Clinical Research Center for Ocular Diseases, Sun Yat-sen University, Guangzhou, China
- Yanxian Chen
- School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong
- Fan Song
- School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong
- Shunming Liu
- Department of Ophthalmology, Guangdong Academy of Medical Sciences, Guangdong Provincial People's Hospital, Guangzhou, China
- Ruobing Wang
- Department of Ophthalmology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Yingfeng Zheng
- State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Guangdong Provincial Clinical Research Center for Ocular Diseases, Sun Yat-sen University, Guangzhou, China
- Mingguang He
- School of Optometry, The Hong Kong Polytechnic University, Kowloon, Hong Kong
- Research Centre for SHARP Vision (RCSV), The Hong Kong Polytechnic University, Kowloon, Hong Kong
- Department of Ophthalmology, Guangdong Academy of Medical Sciences, Guangdong Provincial People's Hospital, Guangzhou, China
5
Honkamaa J, Khan U, Koivukoski S, Valkonen M, Latonen L, Ruusuvuori P, Marttinen P. Deformation equivariant cross-modality image synthesis with paired non-aligned training data. Med Image Anal 2023; 90:102940. [PMID: 37666115] [DOI: 10.1016/j.media.2023.102940]
Abstract
Cross-modality image synthesis is an active research topic with multiple clinically relevant medical applications. Recently, methods allowing training with paired but misaligned data have started to emerge. However, no robust and well-performing methods applicable to a wide range of real-world datasets exist. In this work, we propose a generic solution to the problem of cross-modality image synthesis with paired but non-aligned data by introducing new loss functions that encourage deformation equivariance. The method consists of joint training of an image synthesis network together with separate registration networks, and allows adversarial training conditioned on the input even with misaligned data. The work lowers the bar for new clinical applications by allowing effortless training of cross-modality image synthesis networks for more difficult datasets.
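The core idea of deformation equivariance admits a compact sketch: a synthesis network f should commute with a spatial warp T, so the discrepancy between f(T(x)) and T(f(x)) can be penalized. The PyTorch code below is a hedged illustration under that reading, not the paper's loss functions; the network and warp are placeholders.

```python
import torch
import torch.nn.functional as F

def equivariance_penalty(f, x: torch.Tensor, grid: torch.Tensor) -> torch.Tensor:
    """|| f(T(x)) - T(f(x)) ||_1 for a warp T given as a sampling grid.
    x: (N, C, H, W); grid: (N, H, W, 2) in [-1, 1] coordinates."""
    warp = lambda img: F.grid_sample(img, grid, align_corners=False)
    return (f(warp(x)) - warp(f(x))).abs().mean()

# Hypothetical synthesis network and a small random perturbation of the identity grid.
f = torch.nn.Conv2d(1, 1, kernel_size=3, padding=1)
x = torch.rand(1, 1, 32, 32)
ys, xs = torch.meshgrid(torch.linspace(-1, 1, 32), torch.linspace(-1, 1, 32), indexing="ij")
identity = torch.stack([xs, ys], dim=-1).unsqueeze(0)  # grid_sample expects (x, y) order
grid = identity + 0.02 * torch.randn_like(identity)
print(equivariance_penalty(f, x, grid))
```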
Affiliation(s)
- Joel Honkamaa
- Department of Computer Science, Aalto University, Finland.
- Umair Khan
- Institute of Biomedicine, University of Turku, Finland
- Sonja Koivukoski
- Institute of Biomedicine, University of Eastern Finland, Kuopio, Finland
- Mira Valkonen
- Faculty of Medicine and Health Technology, Tampere University, Finland
- Leena Latonen
- Institute of Biomedicine, University of Eastern Finland, Kuopio, Finland
- Pekka Ruusuvuori
- Institute of Biomedicine, University of Turku, Finland; Faculty of Medicine and Health Technology, Tampere University, Finland
6
Khan U, Yasin A. Plane invariant segmentation of computed tomography images through weighted cross entropy optimized conditional GANs in compressed formats. Med Biol Eng Comput 2023; 61:2677-2697. [PMID: 37428300] [DOI: 10.1007/s11517-023-02846-7]
Abstract
Computed tomography (CT) scans provide doctors with first-hand information for identifying an ailment. Deep neural networks help enhance image understanding through segmentation and labeling. In this work, we implement two variants of Pix2Pix generative adversarial networks (GANs) with varying complexities of generator and discriminator networks for plane-invariant segmentation of CT scan images, and subsequently propose an effective generative adversarial network with a suitably weighted binary cross-entropy loss function followed by an image processing layer necessary for obtaining high-quality output segmentations. Our conditional GAN is powered by a unique encoder-decoder network that, coupled with the image processing layer, produces enhanced segmentations. The network can be extended to the complete set of Hounsfield units and can also be implemented on smartphones. Furthermore, we demonstrate the effects of the conditional GAN networks on accuracy, F-1 score, and Jaccard index using the spine vertebrae dataset, achieving an average of 86.28% accuracy, 90.5% Jaccard index, and 89.9% F-1 score in predicting segmentation maps for validation input images. In addition, the accuracy, F-1 score, and Jaccard index curves for validation images show an overall improvement with better continuity.
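As an illustration of a suitably weighted binary cross-entropy of the kind described, a small sketch follows; the weights are hypothetical, not the paper's values, and the idea is simply to up-weight scarce foreground pixels.

```python
import torch

def weighted_bce(pred: torch.Tensor, target: torch.Tensor,
                 w_pos: float = 5.0, w_neg: float = 1.0) -> torch.Tensor:
    """Weighted binary cross-entropy; pred holds probabilities in (0, 1),
    target holds binary labels. w_pos > w_neg emphasizes foreground pixels."""
    eps = 1e-7
    pred = pred.clamp(eps, 1 - eps)
    loss = -(w_pos * target * torch.log(pred) +
             w_neg * (1 - target) * torch.log(1 - pred))
    return loss.mean()

pred = torch.rand(2, 1, 64, 64)                     # hypothetical generator output
target = (torch.rand(2, 1, 64, 64) > 0.9).float()   # sparse foreground mask
print(weighted_bce(pred, target))
```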
Affiliation(s)
- Usman Khan
- SS-CASE-IT Islamabad, Islamabad, Pakistan.
7
Huang LC, Tsai HH. Perceptual Contrastive Generative Adversarial Network based on image warping for unsupervised image-to-image translation. Neural Netw 2023; 166:313-325. [PMID: 37541163] [DOI: 10.1016/j.neunet.2023.07.010]
Abstract
This paper proposes an unsupervised image-to-image (UI2I) translation model, called Perceptual Contrastive Generative Adversarial Network (PCGAN), which can mitigate the distortion problem and thereby enhance the performance of traditional UI2I methods. The PCGAN is designed as a two-stage UI2I model. In the first stage, it leverages a novel image warping to transform the shapes of objects in input (source) images. In the second stage, residual prediction is devised to refine the outputs of the first stage. To promote the performance of the image warping, a loss function, called Perceptual Patch-Wise InfoNCE, is developed in the PCGAN to effectively capture the visual correspondences between warped images and refined images. Experimental results on quantitative evaluation and visual comparison over UI2I benchmarks show that the PCGAN is superior to the other existing methods considered.
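For context, a minimal sketch of the InfoNCE family that Perceptual Patch-Wise InfoNCE builds on (this is the generic contrastive loss, not the paper's perceptual variant): each query patch embedding is pulled toward its corresponding positive and pushed away from negatives.

```python
import torch
import torch.nn.functional as F

def info_nce(query: torch.Tensor, positive: torch.Tensor,
             negatives: torch.Tensor, tau: float = 0.07) -> torch.Tensor:
    """query: (N, D); positive: (N, D); negatives: (N, K, D).
    Standard InfoNCE with cosine similarities and temperature tau."""
    q = F.normalize(query, dim=-1)
    pos = (q * F.normalize(positive, dim=-1)).sum(-1, keepdim=True)      # (N, 1)
    neg = torch.einsum("nd,nkd->nk", q, F.normalize(negatives, dim=-1))  # (N, K)
    logits = torch.cat([pos, neg], dim=1) / tau
    labels = torch.zeros(q.size(0), dtype=torch.long)  # positive sits at index 0
    return F.cross_entropy(logits, labels)

q, p, n = torch.randn(8, 128), torch.randn(8, 128), torch.randn(8, 32, 128)
print(info_nce(q, p, n))
```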
Affiliation(s)
- Lin-Chieh Huang
- Institute of Data Science & Information Computing, National Chung Hsing University, 402, Taichung, Taiwan
- Hung-Hsu Tsai
- Institute of Data Science & Information Computing, National Chung Hsing University, 402, Taichung, Taiwan.
8
Rajabi MM, Komeilian P, Wan X, Farmani R. Leak detection and localization in water distribution networks using conditional deep convolutional generative adversarial networks. Water Res 2023; 238:120012. [PMID: 37150062] [DOI: 10.1016/j.watres.2023.120012]
Abstract
This paper explores the use of conditional deep convolutional generative adversarial networks (CDCGAN) for image-based leak detection and localization (LD&L) in water distribution networks (WDNs). The method employs pressure measurements and is based on four pillars: (1) hydraulic model-based generation of leak-free training data that takes demand uncertainty into account, (2) conversion of hydraulic model input demand-output pressure pairs into images using kriging interpolation, (3) training of a CDCGAN model for image-to-image translation, and (4) use of the structural similarity (SSIM) index for LD&L. SSIM computed over the entire pressure distribution image is used for leak detection, and a local estimate of SSIM is employed for leak localization. The CDCGAN model employed in this paper is based on the pix2pix architecture. The effectiveness of the proposed methodology is demonstrated on leakage datasets under various scenarios. Results show that the method has an accuracy of approximately 70% for real-time leak detection. The proposed method is well suited for real-time applications due to the low computational cost of CDCGAN predictions compared to WDN hydraulic models, is robust in the presence of uncertainty due to the nature of generative adversarial networks, and scales well to large and variable-sized monitoring data due to the use of an image-based approach.
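A hedged sketch of pillar (4) as we read it: skimage's structural_similarity can return a local similarity map in addition to the global score, so the global value can flag a leak and the map's minimum can localize it. The images and threshold below are synthetic placeholders.

```python
import numpy as np
from skimage.metrics import structural_similarity

rng = np.random.default_rng(1)
baseline = rng.random((64, 64))                   # hypothetical leak-free pressure image
observed = baseline.copy()
observed[40:48, 20:28] -= 0.3                     # hypothetical localized pressure drop
observed = np.clip(observed, 0.0, 1.0)

score, local_map = structural_similarity(
    baseline, observed, data_range=1.0, full=True)

THRESHOLD = 0.95  # hypothetical detection threshold
if score < THRESHOLD:
    y, x = np.unravel_index(np.argmin(local_map), local_map.shape)
    print(f"Leak detected (SSIM={score:.3f}); most dissimilar pixel at ({y}, {x})")
```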
Affiliation(s)
- Mohammad Mahdi Rajabi
- Civil and Environmental Engineering Faculty, Tarbiat Modares University, PO Box 14115-397, Tehran, Iran.
- Pooya Komeilian
- Department of Civil Engineering, Sharif University of Technology, Tehran, Iran
- Xi Wan
- Centre for Water Systems, Department of Engineering, University of Exeter, Exeter, Devon EX4 4QF, UK
- Raziyeh Farmani
- Centre for Water Systems, Department of Engineering, University of Exeter, Exeter, Devon EX4 4QF, UK
9
Park KS, Moon JB, Cho SG, Kim J, Song HC. Applying Pix2pix to Translate Hyperemia in Blood Pool Image into Corresponding Increased Bone Uptake in Delayed Image in Three-Phase Bone Scintigraphy. Nucl Med Mol Imaging 2023; 57:103-109. [PMID: 36998587] [PMCID: PMC10043061] [DOI: 10.1007/s13139-022-00786-y]
Abstract
Purpose: Delayed images may not be acquired due to severe pain, drowsiness, or worsening vital signs while waiting after blood pool imaging in three-phase bone scintigraphy. If the hyperemia in the blood pool image contains information from which increased uptake on the delayed images can be inferred, a generative adversarial network (GAN) can generate the increased uptake from the hyperemia. We attempted to apply pix2pix, a type of conditional GAN, to transform hyperemia into increased bone uptake.
Methods: We enrolled 1464 patients who underwent three-phase bone scintigraphy for inflammatory arthritis, osteomyelitis, complex regional pain syndrome (CRPS), cellulitis, and recent bone injury. Blood pool images were acquired 10 min after intravenous injection of Tc-99m hydroxymethylene diphosphonate, and delayed bone images were obtained after 3 h. The model was based on the open-source code of the pix2pix model with perceptual loss. Increased uptake in the delayed images generated by the model was evaluated using lesion-based analysis by a nuclear radiologist in areas consistent with hyperemia in the blood pool images.
Results: The model showed sensitivities of 77.8% and 87.5% for inflammatory arthritis and CRPS, respectively. In osteomyelitis and cellulitis, sensitivities of about 44% were observed. However, in cases of recent bone injury, the sensitivity was only 6.3% in areas consistent with focal hyperemia.
Conclusion: The model based on pix2pix generated increased uptake in delayed images matching the hyperemia in the blood pool image in inflammatory arthritis and CRPS.
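Since the model builds on pix2pix, a sketch of the canonical pix2pix generator objective may help (conditional adversarial loss plus L1; the perceptual term mentioned above is omitted, and the networks below are toy placeholders):

```python
import torch
import torch.nn.functional as F

def pix2pix_generator_loss(D, x: torch.Tensor, fake: torch.Tensor,
                           real: torch.Tensor, lam: float = 100.0) -> torch.Tensor:
    """x: input (blood pool) image; fake = G(x); real: target (delayed) image.
    The discriminator D scores the (input, output) pair, as in conditional GANs."""
    logits = D(torch.cat([x, fake], dim=1))
    adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    l1 = F.l1_loss(fake, real)
    return adv + lam * l1

# Hypothetical tiny discriminator for shape-checking only.
D = torch.nn.Conv2d(2, 1, kernel_size=4, stride=2, padding=1)
x, fake, real = torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)
print(pix2pix_generator_loss(D, x, fake, real))
```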
Affiliation(s)
- Ki Seong Park
- Department of Nuclear Medicine, Chonnam National University Hospital, 42 Jaebong-Ro, Dong-Gu, Gwangju, 61469 Republic of Korea
- Jang Bae Moon
- Department of Nuclear Medicine, Chonnam National University Hospital, 42 Jaebong-Ro, Dong-Gu, Gwangju, 61469 Republic of Korea
- Sang-Geon Cho
- Department of Nuclear Medicine, Chonnam National University Hospital, 42 Jaebong-Ro, Dong-Gu, Gwangju, 61469 Republic of Korea
- Jahae Kim
- Department of Nuclear Medicine, Chonnam National University Hospital, 42 Jaebong-Ro, Dong-Gu, Gwangju, 61469 Republic of Korea
- Department of Nuclear Medicine, Chonnam National University Medical School, 160 Baekseo-ro, Dong-Gu, Gwangju, 61469 Republic of Korea
- Department of Artificial Intelligence Convergence, Chonnam National University, Gwangju, Republic of Korea
- Ho-Chun Song
- Department of Nuclear Medicine, Chonnam National University Hospital, 42 Jaebong-Ro, Dong-Gu, Gwangju, 61469 Republic of Korea
- Department of Nuclear Medicine, Chonnam National University Medical School, 160 Baekseo-ro, Dong-Gu, Gwangju, 61469 Republic of Korea
10
Ko K, Yeom T, Lee M. SuperstarGAN: Generative adversarial networks for image-to-image translation in large-scale domains. Neural Netw 2023; 162:330-339. [PMID: 36940493] [DOI: 10.1016/j.neunet.2023.02.042]
Abstract
Image-to-image translation with generative adversarial networks (GANs) has been extensively studied in recent years. Among existing models, StarGAN achieves image-to-image translation across multiple domains with a single generator, whereas conventional models require multiple generators. However, StarGAN has several limitations, including a lack of capacity to learn mappings among large-scale domains; furthermore, it can barely express small feature changes. To address these limitations, we propose an improved StarGAN, namely SuperstarGAN. We adopt the idea, first proposed in controllable GAN (ControlGAN), of training an independent classifier with data augmentation techniques to handle the overfitting problem in the classification of StarGAN structures. Since a generator with a well-trained classifier can express small features belonging to the target domain, SuperstarGAN achieves image-to-image translation in large-scale domains. Evaluated on a face image dataset, SuperstarGAN demonstrated improved performance in terms of Fréchet inception distance (FID) and learned perceptual image patch similarity (LPIPS): compared to StarGAN, it decreased FID and LPIPS by 18.1% and 42.5%, respectively. Furthermore, an additional experiment with interpolated and extrapolated label values indicates the ability of SuperstarGAN to control the degree of expression of target-domain features in generated images. SuperstarGAN was also successfully adapted to an animal face dataset and a painting dataset, where it can translate styles of animal faces (e.g., a cat to a tiger) and styles of painters (e.g., Hassam to Picasso), respectively, demonstrating its generality across datasets.
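A hedged sketch of the general pattern of steering one generator with an independently trained domain classifier (our generic reading of the ControlGAN-style idea, not SuperstarGAN's exact losses; all networks below are toy placeholders):

```python
import torch
import torch.nn.functional as F

def generator_loss(D, C, fake: torch.Tensor, target_domain: torch.Tensor,
                   lam_cls: float = 1.0) -> torch.Tensor:
    """fake = G(x, target_domain). D scores realism; C is a separately
    trained domain classifier that pushes fakes toward the target domain."""
    logits_d = D(fake)
    adv = F.binary_cross_entropy_with_logits(logits_d, torch.ones_like(logits_d))
    cls = F.cross_entropy(C(fake), target_domain)
    return adv + lam_cls * cls

# Hypothetical tiny networks over 5 domains, for shape-checking only.
D = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 1))
C = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 5))
fake = torch.rand(4, 3, 32, 32)
target = torch.randint(0, 5, (4,))
print(generator_loss(D, C, fake, target))
```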
Affiliation(s)
- Kanghyeok Ko
- School of Electrical and Electronics Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul 06974, South Korea
- Taesun Yeom
- School of Mechanical Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul 06974, South Korea
- Minhyeok Lee
- School of Electrical and Electronics Engineering, Chung-Ang University, 84 Heukseok-ro, Dongjak-gu, Seoul 06974, South Korea.
11
Cai N, Chen H, Li Y, Peng Y, Guo L. Registration on DCE-MRI images via multi-domain image-to-image translation. Comput Med Imaging Graph 2023; 104:102169. [PMID: 36586196] [DOI: 10.1016/j.compmedimag.2022.102169]
Abstract
Registration of dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) is challenging, as rapid intensity changes caused by a contrast agent lead to large registration errors. To address this problem, we propose a novel multi-domain image-to-image translation (MDIT) network based on image disentangling for separating motion from contrast changes before registration. In particular, the DCE images are disentangled into a domain-invariant content space (motion) and a domain-specific attribute space (contrast changes). The disentangled representations are then used to generate images in which the contrast changes have been removed from the motion. The resulting deformations can then be directly derived from the generated images using a free-form deformation (FFD) registration. The method is tested on 10 lung DCE-MRI cases. The proposed method reaches an average root mean squared error of 0.3 ± 0.41, and the separation time is about 2.4 s per case. Results show that the proposed method improves registration efficiency without losing registration accuracy compared with several state-of-the-art registration methods.
12
Jeong J, Kim KD, Nam Y, Cho CE, Go H, Kim N. Stain normalization using score-based diffusion model through stain separation and overlapped moving window patch strategies. Comput Biol Med 2023; 152:106335. [PMID: 36473344] [DOI: 10.1016/j.compbiomed.2022.106335]
Abstract
Hematoxylin and eosin (H&E) staining is the gold standard modality for diagnosis in medicine. However, the dosage ratio of hematoxylin to eosin in H&E staining has not been standardized, and H&E stains fade out at various speeds. Therefore, staining quality can differ between images, and stain normalization is a critical preprocessing step for training deep learning (DL) models, especially in long-term and/or multicenter digital pathology studies. However, conventional methods for stain normalization have significant drawbacks, such as collapsing the structure and/or texture of tissue, and they require a reference patch or slide. Meanwhile, DL-based methods carry a risk of overfitting and/or grid artifacts. We developed a score-based diffusion model of colorization for stain normalization. However, mistransfer, in which the model confuses hematoxylin with eosin, can occur with a score-based diffusion model due to its highly diverse nature. To overcome this mistransfer, we propose a stain separation method using sparse non-negative matrix factorization (SNMF), which can decompose a pathology slide into hematoxylin and eosin components so that each stain component is normalized separately. Furthermore, inpainting with overlapped moving window patches is used to prevent grid artifacts in whole-slide image normalization. Our method normalizes whole-slide pathology images through this stain normalization pipeline with decent performance.
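The stain-separation step can be sketched generically with non-negative matrix factorization on Beer-Lambert optical densities (scikit-learn's dense NMF here, rather than the authors' sparse NMF; the patch is synthetic, and the mapping of components to hematoxylin versus eosin would need to be checked by hue):

```python
import numpy as np
from sklearn.decomposition import NMF

# Hypothetical H&E RGB patch with values in [0, 255].
rng = np.random.default_rng(0)
patch = rng.integers(50, 255, size=(64, 64, 3)).astype(np.float64)

# Beer-Lambert: convert to optical density (add 1 to avoid log(0)).
od = -np.log((patch + 1.0) / 256.0).reshape(-1, 3)

# Factor OD ~ W @ H: W holds per-pixel stain concentrations,
# H holds the two stain color vectors (their assignment to
# hematoxylin/eosin is arbitrary and must be verified).
model = NMF(n_components=2, init="nndsvda", max_iter=500)
W = model.fit_transform(od)   # (pixels, 2) concentration maps
H = model.components_         # (2, 3) stain color vectors

stain_a = W[:, 0].reshape(64, 64)
stain_b = W[:, 1].reshape(64, 64)
print(H, stain_a.mean(), stain_b.mean())
```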
Affiliation(s)
- Jiheon Jeong
- Department of Biomedical Engineering, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, College of Medicine, University of Ulsan, Seoul, Republic of Korea; Department of Convergence Medicine, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Republic of Korea.
- Ki Duk Kim
- Department of Convergence Medicine, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Republic of Korea.
- Yujin Nam
- Department of Biomedical Engineering, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, College of Medicine, University of Ulsan, Seoul, Republic of Korea; Department of Convergence Medicine, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Republic of Korea
- Cristina Eunbee Cho
- Department of Convergence Medicine, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Republic of Korea
- Heounjeong Go
- Department of Pathology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea.
- Namkug Kim
- Department of Convergence Medicine, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Republic of Korea; Department of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea.
13
Vasiljević J, Nisar Z, Feuerhake F, Wemmert C, Lampert T. CycleGAN for virtual stain transfer: Is seeing really believing? Artif Intell Med 2022; 133:102420. [PMID: 36328671] [DOI: 10.1016/j.artmed.2022.102420]
Abstract
Digital pathology is an area prone to high variation due to multiple factors which can strongly affect diagnostic quality and the visual appearance of whole-slide images (WSIs). The state-of-the-art methods for dealing with such variation tend to address it through style-transfer-inspired approaches. Usually, these solutions directly apply successful approaches from the literature, potentially with some task-related modifications. The majority of the obtained results are visually convincing; however, this paper shows that this is no guarantee that such images can be directly used for either medical diagnosis or reducing domain shift. This article shows that a slight modification in a stain transfer architecture, such as the choice of normalisation layer, while resulting in a variety of visually appealing results, surprisingly greatly affects the ability of a stain transfer model to reduce domain shift. Through extensive qualitative and quantitative evaluations, we confirm that translations resulting from different stain transfer architectures are distinct from each other and from the real samples. Therefore, conclusions made by visual inspection or pretrained model evaluation might be misleading.
Affiliation(s)
- Jelica Vasiljević
- ICube, University of Strasbourg, CNRS (UMR 7357), France; University of Belgrade, Belgrade, Serbia; Faculty of Science, University of Kragujevac, Kragujevac, Serbia.
- Zeeshan Nisar
- ICube, University of Strasbourg, CNRS (UMR 7357), France
- Friedrich Feuerhake
- Institute of Pathology, Hannover Medical School, Germany; University Clinic, Freiburg, Germany
- Cédric Wemmert
- ICube, University of Strasbourg, CNRS (UMR 7357), France
- Thomas Lampert
- ICube, University of Strasbourg, CNRS (UMR 7357), France
14
Zuo L, Dewey BE, Liu Y, He Y, Newsome SD, Mowry EM, Resnick SM, Prince JL, Carass A. Unsupervised MR harmonization by learning disentangled representations using information bottleneck theory. Neuroimage 2021; 243:118569. [PMID: 34506916] [PMCID: PMC10473284] [DOI: 10.1016/j.neuroimage.2021.118569]
Abstract
In magnetic resonance (MR) imaging, a lack of standardization in acquisition often causes pulse sequence-based contrast variations in MR images from site to site, which impedes consistent measurements in automatic analyses. In this paper, we propose an unsupervised MR image harmonization approach, CALAMITI (Contrast Anatomy Learning and Analysis for MR Intensity Translation and Integration), which aims to alleviate contrast variations in multi-site MR imaging. Designed using information bottleneck theory, CALAMITI learns a globally disentangled latent space containing both anatomical and contrast information, which permits harmonization. In contrast to supervised harmonization methods, our approach does not need a sample population to be imaged across sites. Unlike traditional unsupervised harmonization approaches which often suffer from geometry shifts, CALAMITI better preserves anatomy by design. The proposed method is also able to adapt to a new testing site with a straightforward fine-tuning process. Experiments on MR images acquired from ten sites show that CALAMITI achieves superior performance compared with other harmonization approaches.
Affiliation(s)
- Lianrui Zuo
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD 21218 USA; Laboratory of Behavioral Neuroscience, National Institute on Aging, National Institute of Health, Baltimore, MD 20892, USA.
- Blake E Dewey
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD 21218 USA
- Yihao Liu
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD 21218 USA
- Yufan He
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD 21218 USA
- Scott D Newsome
- Department of Neurology, The Johns Hopkins School of Medicine, Baltimore, MD 21287, USA
- Ellen M Mowry
- Department of Neurology, The Johns Hopkins School of Medicine, Baltimore, MD 21287, USA
- Susan M Resnick
- Laboratory of Behavioral Neuroscience, National Institute on Aging, National Institute of Health, Baltimore, MD 20892, USA
- Jerry L Prince
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD 21218 USA
- Aaron Carass
- Department of Electrical and Computer Engineering, The Johns Hopkins University, Baltimore, MD 21218 USA
15
Sudarshan VP, Upadhyay U, Egan GF, Chen Z, Awate SP. Towards lower-dose PET using physics-based uncertainty-aware multimodal learning with robustness to out-of-distribution data. Med Image Anal 2021; 73:102187. [PMID: 34348196] [DOI: 10.1016/j.media.2021.102187]
Abstract
Radiation exposure in positron emission tomography (PET) imaging limits its usage in studies of radiation-sensitive populations, e.g., pregnant women, children, and adults who require longitudinal imaging. Reducing the PET radiotracer dose or acquisition time reduces photon counts, which can deteriorate image quality. Recent deep-neural-network (DNN) based methods for image-to-image translation enable the mapping of low-quality PET images (acquired using a substantially reduced dose), coupled with the associated magnetic resonance imaging (MRI) images, to high-quality PET images. However, such DNN methods focus on applications involving test data that closely match the statistical characteristics of the training data and give little attention to evaluating the performance of these DNNs on new out-of-distribution (OOD) acquisitions. We propose a novel DNN formulation that models (i) the underlying sinogram-based physics of the PET imaging system and (ii) the uncertainty in the DNN output through the per-voxel heteroscedasticity of the residuals between the predicted and the high-quality reference images. Our sinogram-based uncertainty-aware DNN framework, namely suDNN, estimates a standard-dose PET image using multimodal input in the form of (i) a low-dose/low-count PET image and (ii) the corresponding multi-contrast MRI images, leading to improved robustness of suDNN to OOD acquisitions. Results on in vivo simultaneous PET-MRI and on various forms of OOD data in PET-MRI show the benefits of suDNN over the current state of the art, quantitatively and qualitatively.
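The per-voxel heteroscedastic residual model has a standard Gaussian negative-log-likelihood form, sketched below; this is the textbook formulation, not necessarily suDNN's exact loss, and the tensors are placeholders.

```python
import torch

def heteroscedastic_nll(mean: torch.Tensor, log_var: torch.Tensor,
                        target: torch.Tensor) -> torch.Tensor:
    """Gaussian NLL with per-voxel predicted variance:
    0.5 * exp(-log_var) * (target - mean)^2 + 0.5 * log_var.
    High-variance (uncertain) voxels are automatically down-weighted."""
    return (0.5 * torch.exp(-log_var) * (target - mean) ** 2 + 0.5 * log_var).mean()

# Hypothetical two-headed network output for a PET slice.
mean = torch.rand(1, 1, 64, 64)
log_var = torch.zeros(1, 1, 64, 64, requires_grad=True)
target = torch.rand(1, 1, 64, 64)
loss = heteroscedastic_nll(mean, log_var, target)
loss.backward()
print(loss.item())
```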
Affiliation(s)
- Viswanath P Sudarshan
- Computer Science and Engineering (CSE) Department, Indian Institute of Technology (IIT) Bombay, Mumbai, India; IITB-Monash Research Academy, Indian Institute of Technology (IIT) Bombay, Mumbai, India
- Uddeshya Upadhyay
- Computer Science and Engineering (CSE) Department, Indian Institute of Technology (IIT) Bombay, Mumbai, India
- Gary F Egan
- Monash Biomedical Imaging (MBI), Monash University, Melbourne, Australia
- Zhaolin Chen
- Monash Biomedical Imaging (MBI), Monash University, Melbourne, Australia
- Suyash P Awate
- Computer Science and Engineering (CSE) Department, Indian Institute of Technology (IIT) Bombay, Mumbai, India.
16
Zheng Z, Yu Z, Wu Y, Zheng H, Zheng B, Lee M. Generative Adversarial Network with Multi-branch Discriminator for imbalanced cross-species image-to-image translation. Neural Netw 2021; 141:355-371. [PMID: 33962124] [DOI: 10.1016/j.neunet.2021.04.013]
Abstract
There has been increased interest in high-level image-to-image translation that achieves semantic matching. With a powerful translation model, we can efficiently synthesize high-quality images with diverse appearances while retaining semantic matching. In this paper, we address an imbalanced learning problem using cross-species image-to-image translation. We aim to perform data augmentation through image translation to boost the recognition performance of imbalanced learning, which requires the model to perform a biomorphic transformation at a semantic level. To tackle this problem, we propose a novel, simple, and effective Multi-Branch Discriminator (termed MBD) structure based on generative adversarial networks (GANs). We demonstrate the effectiveness of the proposed MBD through theoretical analysis as well as empirical evaluation, providing a theoretical proof of why the proposed MBD is effective and optimal. Comprehensive experiments on various cross-species image translation tasks illustrate that our MBD can dramatically improve the performance of popular GANs, with state-of-the-art results in terms of both objective and subjective assessments. Extensive downstream image recognition evaluations in a few-shot setting have also been conducted to demonstrate that the proposed method can effectively boost the performance of imbalanced learning.
Affiliation(s)
- Ziqiang Zheng
- College of Information Science and Engineering / Sanya Oceanographic Institution, Ocean University of China, Qingdao / Sanya, China
- Zhibin Yu
- College of Information Science and Engineering / Sanya Oceanographic Institution, Ocean University of China, Qingdao / Sanya, China
- Yang Wu
- Kyoto University, Kyoto, Japan
- Haiyong Zheng
- College of Information Science and Engineering, Ocean University of China, Qingdao, China.
- Bing Zheng
- College of Information Science and Engineering / Sanya Oceanographic Institution, Ocean University of China, Qingdao / Sanya, China
- Minho Lee
- Kyungpook National University, Daegu, Republic of Korea.
17
Wang CJ, Rost NS, Golland P. Spatial-Intensity Transform GANs for High Fidelity Medical Image-to-Image Translation. Med Image Comput Comput Assist Interv 2020; 12262:749-759. [PMID: 33615318] [PMCID: PMC7888153] [DOI: 10.1007/978-3-030-59713-9_72]
Abstract
Despite recent progress in image-to-image translation, it remains challenging to apply such techniques to clinical quality medical images. We develop a novel parameterization of conditional generative adversarial networks that achieves high image fidelity when trained to transform MRIs conditioned on a patient's age and disease severity. The spatial-intensity transform generative adversarial network (SIT-GAN) constrains the generator to a smooth spatial transform composed with sparse intensity changes. This technique improves image quality and robustness to artifacts, and generalizes to different scanners. We demonstrate SIT-GAN on a large clinical image dataset of stroke patients, where it captures associations between ventricle expansion and aging, as well as between white matter hyperintensities and stroke severity. Additionally, SIT-GAN provides a disentangled view of the variation in shape and appearance across subjects.
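The output parameterization named above can be sketched as a smooth warp composed with a sparse additive intensity change; the following is an illustrative reading, not the authors' code, with a placeholder flow and intensity map.

```python
import torch
import torch.nn.functional as F

def spatial_intensity_transform(x: torch.Tensor, flow: torch.Tensor, delta: torch.Tensor):
    """x: (N, 1, H, W); flow: (N, H, W, 2) offsets added to the identity grid;
    delta: (N, 1, H, W) additive intensity change, kept sparse via an L1 penalty."""
    n, _, h, w = x.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    identity = torch.stack([xs, ys], dim=-1).expand(n, h, w, 2)
    warped = F.grid_sample(x, identity + flow, align_corners=False)
    y = warped + delta
    sparsity = delta.abs().mean()  # encourages sparse intensity edits
    return y, sparsity

x = torch.rand(1, 1, 32, 32)
flow = 0.01 * torch.randn(1, 32, 32, 2)   # hypothetical small, smooth warp
delta = torch.zeros(1, 1, 32, 32)         # hypothetical intensity-change map
y, l1 = spatial_intensity_transform(x, flow, delta)
print(y.shape, l1.item())
```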
Affiliation(s)
- Clinton J Wang
- Computer Science and Artificial Intelligence Lab, MIT, Cambridge, MA, USA
- Natalia S Rost
- Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
- Polina Golland
- Computer Science and Artificial Intelligence Lab, MIT, Cambridge, MA, USA
18
Maspero M, Houweling AC, Savenije MHF, van Heijst TCF, Verhoeff JJC, Kotte ANTJ, van den Berg CAT. A single neural network for cone-beam computed tomography-based radiotherapy of head-and-neck, lung and breast cancer. Phys Imaging Radiat Oncol 2020; 14:24-31. [PMID: 33458310] [PMCID: PMC7807541] [DOI: 10.1016/j.phro.2020.04.002]
Abstract
A deep learning network facilitated dose calculation from CBCT. A single network achieved CBCT-based dose calculation generating synthetic CT for head-and-neck, lung, and breast cancer patients with similar performance to a network specifically trained for each anatomical site. Generation of synthetic-CT can be achieved within 10 s, facilitating online adaptive radiotherapy scenarios.
Background and purpose: Adaptive radiotherapy based on cone-beam computed tomography (CBCT) requires high CT-number accuracy to ensure accurate dose calculations. Recently, deep learning has been proposed for fast CBCT artefact corrections on single anatomical sites. This study investigated the feasibility of applying a single convolutional network to facilitate CBCT-based dose calculation for head-and-neck, lung and breast cancer patients.
Materials and methods: Ninety-nine patients diagnosed with head-and-neck, lung or breast cancer undergoing radiotherapy with CBCT-based position verification were included in this study. The CBCTs were registered to planning CT according to clinical procedures. Three cycle-consistent generative adversarial networks (cycle-GANs) were trained in an unpaired manner on 15 patients per anatomical site, generating synthetic CTs (sCTs). Another network was trained with all the anatomical sites together. The performance of all four networks was compared and evaluated for image similarity against rescan CT (rCT). Clinical plans were recalculated on rCT and sCT and analysed through voxel-based dose differences and γ-analysis.
Results: An sCT was generated in 10 s. Image similarity was comparable between models trained on different anatomical sites and a single model for all sites. Mean dose differences < 0.5% were obtained in high-dose regions. Mean gamma (3%, 3 mm) pass rates > 95% were achieved for all sites.
Conclusion: Cycle-GAN reduced CBCT artefacts and increased similarity to CT, enabling sCT-based dose calculations. A single network achieved CBCT-based dose calculation generating synthetic CT for head-and-neck, lung, and breast cancer patients with similar performance to a network specifically trained for each anatomical site.
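Since the networks are cycle-GANs trained on unpaired CBCT/CT, the standard cycle-consistency term is worth sketching (the canonical CycleGAN formulation, not this paper's full objective; the generators below are toy placeholders):

```python
import torch
import torch.nn.functional as F

def cycle_consistency_loss(G, F_net, cbct: torch.Tensor, ct: torch.Tensor,
                           lam: float = 10.0) -> torch.Tensor:
    """G: CBCT -> synthetic CT; F_net: CT -> synthetic CBCT.
    Unpaired training is regularized by requiring both round trips
    to reconstruct their inputs."""
    forward_cycle = F.l1_loss(F_net(G(cbct)), cbct)
    backward_cycle = F.l1_loss(G(F_net(ct)), ct)
    return lam * (forward_cycle + backward_cycle)

# Hypothetical tiny generators for shape-checking only.
G = torch.nn.Conv2d(1, 1, kernel_size=3, padding=1)
F_net = torch.nn.Conv2d(1, 1, kernel_size=3, padding=1)
cbct, ct = torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)
print(cycle_consistency_loss(G, F_net, cbct, ct))
```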
Affiliation(s)
- Matteo Maspero
- Department of radiotherapy, division of imaging & oncology, University Medical Center Utrecht, Heidelberglaan 100, 3508 GA Utrecht, The Netherlands; Computational imaging group for MR diagnostics & therapy, center for image sciences, University Medical Center Utrecht, Heidelberglaan 100, 3508 GA Utrecht, The Netherlands
- Antonetta C Houweling
- Department of radiotherapy, division of imaging & oncology, University Medical Center Utrecht, Heidelberglaan 100, 3508 GA Utrecht, The Netherlands
- Mark H F Savenije
- Department of radiotherapy, division of imaging & oncology, University Medical Center Utrecht, Heidelberglaan 100, 3508 GA Utrecht, The Netherlands; Computational imaging group for MR diagnostics & therapy, center for image sciences, University Medical Center Utrecht, Heidelberglaan 100, 3508 GA Utrecht, The Netherlands
- Tristan C F van Heijst
- Department of radiotherapy, division of imaging & oncology, University Medical Center Utrecht, Heidelberglaan 100, 3508 GA Utrecht, The Netherlands
- Joost J C Verhoeff
- Department of radiotherapy, division of imaging & oncology, University Medical Center Utrecht, Heidelberglaan 100, 3508 GA Utrecht, The Netherlands
- Alexis N T J Kotte
- Department of radiotherapy, division of imaging & oncology, University Medical Center Utrecht, Heidelberglaan 100, 3508 GA Utrecht, The Netherlands
- Cornelis A T van den Berg
- Department of radiotherapy, division of imaging & oncology, University Medical Center Utrecht, Heidelberglaan 100, 3508 GA Utrecht, The Netherlands; Computational imaging group for MR diagnostics & therapy, center for image sciences, University Medical Center Utrecht, Heidelberglaan 100, 3508 GA Utrecht, The Netherlands
19
Kaji S, Kida S. Overview of image-to-image translation by use of deep neural networks: denoising, super-resolution, modality conversion, and reconstruction in medical imaging. Radiol Phys Technol 2019; 12:235-248. [PMID: 31222562] [DOI: 10.1007/s12194-019-00520-y]
Abstract
Since the advent of deep convolutional neural networks (DNNs), computer vision has seen an extremely rapid progress that has led to huge advances in medical imaging. Every year, many new methods are reported at conferences such as the International Conference on Medical Image Computing and Computer-Assisted Intervention and Machine Learning for Medical Image Reconstruction, or published online at the preprint server arXiv. There is a plethora of surveys on applications of neural networks in medical imaging (see [1] for a relatively recent comprehensive survey). This article does not aim to cover all aspects of the field, but focuses on a particular topic, image-to-image translation. Although the topic may not sound familiar, it turns out that many seemingly irrelevant applications can be understood as instances of image-to-image translation. Such applications include (1) noise reduction, (2) super-resolution, (3) image synthesis, and (4) reconstruction. The same underlying principles and algorithms work for various tasks. Our aim is to introduce some of the key ideas on this topic from a uniform viewpoint. We introduce core ideas and jargon that are specific to image processing by use of DNNs. Having an intuitive grasp of the core ideas of applications of neural networks in medical imaging and a knowledge of technical terms would be of great help to the reader for understanding the existing and future applications. Most of the recent applications which build on image-to-image translation are based on one of two fundamental architectures, called pix2pix and CycleGAN, depending on whether the available training data are paired or unpaired (see Sect. 1.3). We provide codes ([2, 3]) which implement these two architectures with various enhancements. Our codes are available online with use of the very permissive MIT license. We provide a hands-on tutorial for training a model for denoising based on our codes (see Sect. 6). We hope that this article, together with the codes, will provide both an overview and the details of the key algorithms and that it will serve as a basis for the development of new applications.