51
Zhou T, Zhang X, Lu H, Li Q, Liu L, Zhou H. GMRE-iUnet: Isomorphic Unet fusion model for PET and CT lung tumor images. Comput Biol Med 2023; 166:107514. [PMID: 37826951] [DOI: 10.1016/j.compbiomed.2023.107514]
Abstract
Lung tumor PET and CT image fusion is a key technology in clinical diagnosis. However, existing fusion methods struggle to produce fused images with high contrast, prominent morphological features, and accurate spatial localization. This paper proposes an isomorphic Unet fusion model (GMRE-iUnet) for lung tumor PET and CT images to address these problems. The main ideas are as follows. First, an isomorphic Unet fusion network containing two independent multiscale dual-encoder Unets is constructed; it captures lesion-region features and spatial localization and enriches morphological information. Second, a hybrid CNN-Transformer feature extraction module (HCTrans) is built to effectively integrate local lesion features and global contextual information, and a residual axial attention feature compensation module (RAAFC) is embedded into the Unet to capture fine-grained information as compensation features, focusing the model on local connections between neighboring pixels. Third, a hybrid attentional feature fusion module (HAFF) is designed for multiscale feature fusion; it aggregates edge information and detail representations using local entropy and Gaussian filtering. Finally, experiments on a multimodal lung tumor medical image dataset show that the model achieves excellent fusion performance compared with eight other fusion models. In the comparison experiment on CT mediastinal-window and PET images, the AG, EI, QAB/F, SF, SD, and IE indexes improve by 16.19%, 26%, 3.81%, 1.65%, 3.91%, and 8.01%, respectively. GMRE-iUnet highlights the information and morphological features of lesion areas and provides practical help for the aided diagnosis of lung tumors.
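The indexes quoted above are standard no-reference fusion metrics. As an illustration only, not the authors' implementation, a minimal NumPy sketch of three of them (AG, SF, IE) using their common textbook definitions:

```python
import numpy as np

def average_gradient(img):
    """AG: mean local gradient magnitude, a sharpness proxy."""
    img = img.astype(np.float64)
    gx = np.diff(img, axis=1)[:-1, :]   # horizontal differences, cropped to a common shape
    gy = np.diff(img, axis=0)[:, :-1]   # vertical differences, cropped to a common shape
    return float(np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0)))

def spatial_frequency(img):
    """SF: combined row and column frequency of the fused image."""
    img = img.astype(np.float64)
    rf = np.sqrt(np.mean(np.diff(img, axis=1) ** 2))  # row frequency
    cf = np.sqrt(np.mean(np.diff(img, axis=0) ** 2))  # column frequency
    return float(np.sqrt(rf ** 2 + cf ** 2))

def information_entropy(img, bins=256):
    """IE: Shannon entropy of the grey-level histogram."""
    hist, _ = np.histogram(img, bins=bins)
    p = hist[hist > 0] / hist.sum()
    return float(-np.sum(p * np.log2(p)))
```

In a comparison like the one above, such metrics would typically be averaged over the fused test slices before computing percentage improvements.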
Affiliation(s)
- Tao Zhou
- School of Computer Science and Engineering, North Minzu University, Yinchuan, 750021, China; Key Laboratory of Image and Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan, 750021, China
- Xiangxiang Zhang
- School of Computer Science and Engineering, North Minzu University, Yinchuan, 750021, China
- Huiling Lu
- School of Medical Information & Engineering, Ningxia Medical University, Yinchuan, 750004, China
- Qi Li
- School of Computer Science and Engineering, North Minzu University, Yinchuan, 750021, China
- Long Liu
- School of Computer Science and Engineering, North Minzu University, Yinchuan, 750021, China
- Huiyu Zhou
- School of Computing and Mathematical Sciences, University of Leicester, LE1 7RH, United Kingdom
52
Zhang Y, Li X, Ji Y, Ding H, Suo X, He X, Xie Y, Liang M, Zhang S, Yu C, Qin W. MRAβ: A multimodal MRI-derived amyloid-β biomarker for Alzheimer's disease. Hum Brain Mapp 2023; 44:5139-5152. [PMID: 37578386] [PMCID: PMC10502620] [DOI: 10.1002/hbm.26452]
Abstract
Florbetapir 18F (AV45), a highly sensitive and specific positron emission tomography (PET) molecular biomarker that binds to the amyloid-β of Alzheimer's disease (AD), is constrained by radiation exposure and cost. We sought to address this by combining multimodal magnetic resonance imaging (MRI) with a collaborative generative adversarial network (CollaGAN) to develop a multimodal MRI-derived amyloid-β (MRAβ) biomarker. We collected multimodal MRI and AV45 PET data from 380 qualified participants in the ADNI dataset and 64 subjects from the OASIS3 dataset. A five-fold cross-validated CollaGAN was applied to generate MRAβ. In the ADNI dataset, we found MRAβ could characterize the subject-level AV45 spatial variations in both AD and mild cognitive impairment (MCI). Voxel-wise two-sample t-tests demonstrated that amyloid-β depositions identified by MRAβ in AD and MCI were significantly higher than in healthy controls (HCs) across widespread cortices (p < .05, corrected) and were highly similar to those identified by AV45 (r > .92, p < .001). Moreover, a 3D ResNet classifier demonstrated that MRAβ was comparable to AV45 in discriminating AD from HC in both the ADNI and OASIS3 datasets, and in discriminating MCI from HC in ADNI. Finally, we found MRAβ could mimic the cortical hyper-AV45 of HCs who later converted to MCI (r = .79, p < .001) and was comparable to AV45 in discriminating them from stable HCs (p > .05). In summary, our work illustrates that MRAβ synthesized from multimodal MRI can mimic cerebral amyloid-β depositions like AV45 and lends credence to the feasibility of advancing MRI toward molecularly explainable biomarkers.
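The reported spatial similarity (r > .92) is a voxel-wise correlation between co-registered volumes. A minimal sketch of such a masked Pearson correlation, with hypothetical variable names and assuming the volumes are already co-registered:

```python
import numpy as np

def masked_pearson_r(synthetic, reference, brain_mask):
    """Pearson correlation between two co-registered volumes inside a binary brain mask."""
    x = synthetic[brain_mask > 0].astype(np.float64).ravel()
    y = reference[brain_mask > 0].astype(np.float64).ravel()
    x -= x.mean()
    y -= y.mean()
    return float((x @ y) / (np.linalg.norm(x) * np.linalg.norm(y) + 1e-12))
```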
Affiliation(s)
- Yu Zhang
- Department of Radiology and Tianjin Key Lab of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China
- Xi Li
- Department of Radiology and Tianjin Key Lab of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China
- Department of Radiology, First Clinical Medical College and First Hospital of Shanxi Medical University, Taiyuan, Shanxi Province, China
- Yi Ji
- Department of Radiology and Tianjin Key Lab of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China
- Hao Ding
- Department of Radiology and Tianjin Key Lab of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China
- School of Medical Imaging, Tianjin Medical University, Tianjin, China
- Xinjun Suo
- Department of Radiology and Tianjin Key Lab of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China
- Xiaoxi He
- Department of Radiology and Tianjin Key Lab of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China
- Yingying Xie
- Department of Radiology and Tianjin Key Lab of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China
- Meng Liang
- School of Medical Imaging, Tianjin Medical University, Tianjin, China
- Shijie Zhang
- Department of Pharmacology, Tianjin Medical University, Tianjin, China
- Chunshui Yu
- Department of Radiology and Tianjin Key Lab of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China
- School of Medical Imaging, Tianjin Medical University, Tianjin, China
- Wen Qin
- Department of Radiology and Tianjin Key Lab of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China
53
Dorent R, Haouchine N, Kogl F, Joutard S, Juvekar P, Torio E, Golby A, Ourselin S, Frisken S, Vercauteren T, Kapur T, Wells WM. Unified Brain MR-Ultrasound Synthesis using Multi-Modal Hierarchical Representations. Med Image Comput Comput Assist Interv 2023; 2023:448-458. [PMID: 38655383] [PMCID: PMC7615858] [DOI: 10.1007/978-3-031-43999-5_43]
Abstract
We introduce MHVAE, a deep hierarchical variational autoencoder (VAE) that synthesizes missing images from various modalities. Extending multi-modal VAEs with a hierarchical latent structure, we introduce a probabilistic formulation for fusing multi-modal images in a common latent representation while having the flexibility to handle incomplete image sets as input. Moreover, adversarial learning is employed to generate sharper images. Extensive experiments are performed on the challenging problem of joint intra-operative ultrasound (iUS) and Magnetic Resonance (MR) synthesis. Our model outperformed multi-modal VAEs, conditional GANs, and the current state-of-the-art unified method (ResViT) for synthesizing missing images, demonstrating the advantage of using a hierarchical latent representation and a principled probabilistic fusion operation. Our code is publicly available.
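The abstract describes a probabilistic fusion of modality-specific posteriors in a shared latent space that tolerates missing inputs. One common way to realize this in multi-modal VAEs is a product-of-experts over the available modalities; the sketch below illustrates that generic idea and is not necessarily the exact MHVAE fusion operation:

```python
import torch

def product_of_experts(mus, logvars):
    """Fuse the Gaussian posteriors of the modalities that are present
    (lists of (batch, latent_dim) tensors) by precision weighting.
    A unit-Gaussian prior expert can be included by appending zeros to both lists."""
    precisions = [torch.exp(-lv) for lv in logvars]            # 1 / variance per modality
    total_precision = torch.stack(precisions, dim=0).sum(dim=0)
    fused_var = 1.0 / total_precision
    fused_mu = fused_var * torch.stack(
        [p * m for p, m in zip(precisions, mus)], dim=0).sum(dim=0)
    return fused_mu, torch.log(fused_var)
```

Because only the present modalities contribute experts, the same fusion rule handles incomplete image sets at inference time.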
Affiliation(s)
- Reuben Dorent
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
- Nazim Haouchine
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
- Fryderyk Kogl
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
- Parikshit Juvekar
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
- Erickson Torio
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
- Alexandra Golby
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
- Sarah Frisken
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
- Tina Kapur
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
- William M Wells
- Harvard Medical School, Brigham and Women's Hospital, Boston, MA, USA
- Massachusetts Institute of Technology, Cambridge, MA, USA
54
Liu L, Liu Z, Chang J, Qiao H, Sun T, Shang J. MGGAN: A multi-generator generative adversarial network for breast cancer immunohistochemical image generation. Heliyon 2023; 9:e20614. [PMID: 37860562] [PMCID: PMC10582479] [DOI: 10.1016/j.heliyon.2023.e20614]
Abstract
The immunohistochemical (IHC) technique is widely used for evaluating diagnostic markers, but obtaining IHC-stained sections can be expensive. Translating cheap and readily available hematoxylin and eosin (HE) images into IHC images offers a solution to this challenge. In this paper, we propose a multi-generator generative adversarial network (MGGAN) that can generate high-quality IHC images from HE images of breast cancer. Our MGGAN approach combines the low-frequency and high-frequency components of the HE image to improve the translation of breast cancer image details. We use multiple generators to extract semantic information, and a U-shaped architecture with a patch-based discriminator to collect and optimize the low-frequency and high-frequency components of an image. We also include a cross-entropy loss as a regularization term in the loss function to ensure consistency between the synthesized image and the real image. Our experimental and visualization results demonstrate that our method outperforms other state-of-the-art image synthesis methods in both quantitative and qualitative analysis. Our approach provides a cost-effective and efficient solution for obtaining high-quality IHC images.
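The low-/high-frequency decomposition of the HE image can be illustrated with a simple Gaussian low-pass split; the filter and sigma below are assumptions for illustration, not the authors' choice:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def split_frequencies(he_image, sigma=3.0):
    """Split an H&E image of shape (H, W, C) into a low-frequency base
    (Gaussian blur) and a high-frequency residual; sigma is a placeholder value."""
    he_image = he_image.astype(np.float64)
    low = gaussian_filter(he_image, sigma=(sigma, sigma, 0))  # do not blur across channels
    high = he_image - low
    return low, high
```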
Affiliation(s)
- Liangliang Liu
- College of Information and Management Science, Henan Agricultural University, Zhengzhou, Henan 450046, PR China
- Zhihong Liu
- College of Information and Management Science, Henan Agricultural University, Zhengzhou, Henan 450046, PR China
- Jing Chang
- College of Information and Management Science, Henan Agricultural University, Zhengzhou, Henan 450046, PR China
- Hongbo Qiao
- College of Information and Management Science, Henan Agricultural University, Zhengzhou, Henan 450046, PR China
- Tong Sun
- College of Information and Management Science, Henan Agricultural University, Zhengzhou, Henan 450046, PR China
- Junping Shang
- College of Information and Management Science, Henan Agricultural University, Zhengzhou, Henan 450046, PR China
55
Diao Y, Li F, Li Z. Joint learning-based feature reconstruction and enhanced network for incomplete multi-modal brain tumor segmentation. Comput Biol Med 2023; 163:107234. [PMID: 37450967] [DOI: 10.1016/j.compbiomed.2023.107234]
Abstract
Multimodal magnetic resonance imaging (MRI) can provide valuable complementary information and substantially enhance the performance of brain tumor segmentation. However, certain modalities are commonly absent or missing during clinical diagnosis, which can significantly impair segmentation techniques that rely on complete modalities. Current advanced methods attempt to address this challenge by developing shared feature representations via modal fusion to handle different missing-modality situations. Considering the importance of missing-modality information in multimodal segmentation, this paper utilizes a feature reconstruction method to recover the missing information and proposes a joint learning-based feature reconstruction and enhancement method for incomplete-modality brain tumor segmentation. The method leverages an information learning mechanism to transfer information from the complete modality to a single modality, enabling it to obtain complete brain tumor information even without the support of other modalities. Additionally, the method incorporates a module for reconstructing missing-modality features, which recovers fused features of the absent modality by utilizing the abundant latent information obtained from the available modalities. Furthermore, a feature enhancement mechanism improves the shared feature representation by utilizing the information obtained from the reconstructed missing modalities. These processes enable the method to obtain more comprehensive information about brain tumors under various missing-modality circumstances, thereby enhancing the model's robustness. The performance of the proposed model was evaluated on BraTS datasets and compared with other deep learning algorithms using Dice similarity scores. On the BraTS2018 dataset, the proposed algorithm achieved Dice similarity scores of 86.28%, 77.02%, and 59.64% for whole tumor, tumor core, and enhancing tumor, respectively. These results demonstrate the superiority of our framework over state-of-the-art methods in missing-modality situations.
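The Dice similarity scores quoted above compare a predicted tumor mask against the ground truth; a minimal sketch of the metric:

```python
import numpy as np

def dice_score(pred_mask, gt_mask, eps=1e-7):
    """Dice similarity coefficient between two binary segmentation masks."""
    pred = pred_mask.astype(bool)
    gt = gt_mask.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    return float((2.0 * intersection + eps) / (pred.sum() + gt.sum() + eps))
```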
Affiliation(s)
- Yueqin Diao
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming 650500, China
- Fan Li
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming 650500, China
- Zhiyuan Li
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming 650500, China
56
Liu J, Pasumarthi S, Duffy B, Gong E, Datta K, Zaharchuk G. One Model to Synthesize Them All: Multi-Contrast Multi-Scale Transformer for Missing Data Imputation. IEEE Trans Med Imaging 2023; 42:2577-2591. [PMID: 37030684] [PMCID: PMC10543020] [DOI: 10.1109/tmi.2023.3261707]
Abstract
Multi-contrast magnetic resonance imaging (MRI) is widely used in clinical practice as each contrast provides complementary information. However, the availability of each imaging contrast may vary amongst patients, which poses challenges to radiologists and automated image analysis algorithms. A general approach for tackling this problem is missing data imputation, which aims to synthesize the missing contrasts from existing ones. While several convolutional neural networks (CNN) based algorithms have been proposed, they suffer from the fundamental limitations of CNN models, such as the requirement for fixed numbers of input and output channels, the inability to capture long-range dependencies, and the lack of interpretability. In this work, we formulate missing data imputation as a sequence-to-sequence learning problem and propose a multi-contrast multi-scale Transformer (MMT), which can take any subset of input contrasts and synthesize those that are missing. MMT consists of a multi-scale Transformer encoder that builds hierarchical representations of inputs combined with a multi-scale Transformer decoder that generates the outputs in a coarse-to-fine fashion. The proposed multi-contrast Swin Transformer blocks can efficiently capture intra- and inter-contrast dependencies for accurate image synthesis. Moreover, MMT is inherently interpretable as it allows us to understand the importance of each input contrast in different regions by analyzing the in-built attention maps of Transformer blocks in the decoder. Extensive experiments on two large-scale multi-contrast MRI datasets demonstrate that MMT outperforms the state-of-the-art methods quantitatively and qualitatively.
57
Li T, Wang J, Yang Y, Glide-Hurst CK, Wen N, Cai J. Multi-parametric MRI for radiotherapy simulation. Med Phys 2023; 50:5273-5293. [PMID: 36710376] [PMCID: PMC10382603] [DOI: 10.1002/mp.16256]
Abstract
Magnetic resonance imaging (MRI) has become an important imaging modality in the field of radiotherapy (RT) in the past decade, especially with the development of various novel MRI and image-guidance techniques. In this review article, we describe recent developments and discuss the applications of multi-parametric MRI (mpMRI) in RT simulation. Here, mpMRI is used in a general and loose sense that encompasses various multi-contrast MRI techniques. Specifically, we focus on the implementation, challenges, and future directions of mpMRI techniques for RT simulation.
Affiliation(s)
- Tian Li
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong, China
- Jihong Wang
- Department of Radiation Physics, Division of Radiation Oncology, MD Anderson Cancer Center, Houston, Texas, USA
- Yingli Yang
- Department of Radiology, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
- SJTU-Ruijing-UIH Institute for Medical Imaging Technology, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
- Carri K Glide-Hurst
- Department of Radiation Oncology, University of Wisconsin, Madison, Wisconsin, USA
- Ning Wen
- Department of Radiology, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
- SJTU-Ruijing-UIH Institute for Medical Imaging Technology, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
- The Global Institute of Future Technology, Shanghai Jiaotong University, Shanghai, China
- Jing Cai
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong, China
58
Liu X, Prince JL, Xing F, Zhuo J, Reese T, Stone M, El Fakhri G, Woo J. Attentive continuous generative self-training for unsupervised domain adaptive medical image translation. Med Image Anal 2023; 88:102851. [PMID: 37329854] [PMCID: PMC10527936] [DOI: 10.1016/j.media.2023.102851]
Abstract
Self-training is an important class of unsupervised domain adaptation (UDA) approaches that are used to mitigate the problem of domain shift, when applying knowledge learned from a labeled source domain to unlabeled and heterogeneous target domains. While self-training-based UDA has shown considerable promise on discriminative tasks, including classification and segmentation, through reliable pseudo-label filtering based on the maximum softmax probability, there is a paucity of prior work on self-training-based UDA for generative tasks, including image modality translation. To fill this gap, in this work, we seek to develop a generative self-training (GST) framework for domain adaptive image translation with continuous value prediction and regression objectives. Specifically, we quantify both aleatoric and epistemic uncertainties within our GST using variational Bayes learning to measure the reliability of synthesized data. We also introduce a self-attention scheme that de-emphasizes the background region to prevent it from dominating the training process. The adaptation is then carried out by an alternating optimization scheme with target domain supervision that focuses attention on the regions with reliable pseudo-labels. We evaluated our framework on two cross-scanner/center, inter-subject translation tasks, including tagged-to-cine magnetic resonance (MR) image translation and T1-weighted MR-to-fractional anisotropy translation. Extensive validations with unpaired target domain data showed that our GST yielded superior synthesis performance in comparison to adversarial training UDA methods.
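One standard way to model the aleatoric part of the uncertainty in a regression-style translation network is a heteroscedastic Gaussian negative log-likelihood, in which the network predicts a per-pixel mean and log-variance. The sketch below shows that generic formulation, not the paper's exact variational Bayes objective; epistemic uncertainty would be estimated separately (e.g., by Monte Carlo sampling over the weight posterior):

```python
import torch

def heteroscedastic_nll(pred_mean, pred_logvar, target):
    """Per-pixel Gaussian negative log-likelihood; the predicted log-variance
    captures aleatoric (data) uncertainty and down-weights unreliable pixels."""
    inv_var = torch.exp(-pred_logvar)
    return torch.mean(0.5 * inv_var * (target - pred_mean) ** 2 + 0.5 * pred_logvar)
```

Pixels with low predicted reliability contribute less to the loss, which is consistent with the reliability-based filtering of pseudo-labels described above.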
Affiliation(s)
- Xiaofeng Liu
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, 02114, USA
- Jerry L Prince
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
- Fangxu Xing
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, 02114, USA
- Jiachen Zhuo
- Department of Neural and Pain Sciences, University of Maryland School of Dentistry, Baltimore, MD, USA
- Timothy Reese
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Maureen Stone
- Department of Neural and Pain Sciences, University of Maryland School of Dentistry, Baltimore, MD, USA
- Georges El Fakhri
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, 02114, USA
- Jonghye Woo
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, 02114, USA
59
Kazerouni A, Aghdam EK, Heidari M, Azad R, Fayyaz M, Hacihaliloglu I, Merhof D. Diffusion models in medical imaging: A comprehensive survey. Med Image Anal 2023; 88:102846. [PMID: 37295311] [DOI: 10.1016/j.media.2023.102846]
Abstract
Denoising diffusion models, a class of generative models, have recently garnered immense interest in various deep-learning problems. A diffusion probabilistic model defines a forward diffusion stage in which the input data are gradually perturbed over several steps by adding Gaussian noise, and then learns to reverse the diffusion process to retrieve the desired noise-free data from noisy samples. Diffusion models are widely appreciated for their strong mode coverage and the quality of the generated samples, despite their known computational burden. Capitalizing on advances in computer vision, the field of medical imaging has also observed growing interest in diffusion models. To help researchers navigate this profusion, this survey provides a comprehensive overview of diffusion models in the discipline of medical imaging. Specifically, we start with an introduction to the theoretical foundations and fundamental concepts behind diffusion models and the three generic diffusion modeling frameworks: diffusion probabilistic models, noise-conditioned score networks, and stochastic differential equations. We then provide a systematic taxonomy of diffusion models in the medical domain and propose a multi-perspective categorization based on application, imaging modality, organ of interest, and algorithm. To this end, we cover extensive applications of diffusion models in the medical domain, including image-to-image translation, reconstruction, registration, classification, segmentation, denoising, 2D/3D generation, anomaly detection, and other medically related challenges. Furthermore, we emphasize the practical use cases of selected approaches, discuss the limitations of diffusion models in the medical domain, and propose several directions to fulfill the demands of this field. Finally, we gather the reviewed studies and their available open-source implementations in a GitHub repository, which we aim to update regularly with the latest relevant papers.
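The forward corruption stage described above has a closed form in denoising diffusion probabilistic models, q(x_t | x_0) = N(sqrt(ᾱ_t) x_0, (1 - ᾱ_t) I), where ᾱ_t is the cumulative product of the per-step noise schedule. A minimal sketch of sampling it:

```python
import torch

def diffuse(x0, t, alpha_bar):
    """Sample x_t ~ q(x_t | x_0) for a denoising diffusion probabilistic model.
    x0: clean images (batch, C, H, W); t: integer timesteps (batch,);
    alpha_bar: 1-D tensor of cumulative products of (1 - beta_s)."""
    noise = torch.randn_like(x0)
    a = alpha_bar[t].view(-1, *([1] * (x0.dim() - 1)))      # broadcast over image dims
    x_t = torch.sqrt(a) * x0 + torch.sqrt(1.0 - a) * noise
    return x_t, noise   # the denoiser is typically trained to predict `noise` from (x_t, t)
```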
Affiliation(s)
- Amirhossein Kazerouni
- School of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran
- Moein Heidari
- School of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran
- Reza Azad
- Faculty of Electrical Engineering and Information Technology, RWTH Aachen University, Aachen, Germany
- Ilker Hacihaliloglu
- Department of Radiology, University of British Columbia, Vancouver, Canada; Department of Medicine, University of British Columbia, Vancouver, Canada
- Dorit Merhof
- Faculty of Informatics and Data Science, University of Regensburg, Regensburg, Germany; Fraunhofer Institute for Digital Medicine MEVIS, Bremen, Germany
60
Jiao C, Ling D, Bian S, Vassantachart A, Cheng K, Mehta S, Lock D, Zhu Z, Feng M, Thomas H, Scholey JE, Sheng K, Fan Z, Yang W. Contrast-Enhanced Liver Magnetic Resonance Image Synthesis Using Gradient Regularized Multi-Modal Multi-Discrimination Sparse Attention Fusion GAN. Cancers (Basel) 2023; 15:3544. [PMID: 37509207] [PMCID: PMC10377331] [DOI: 10.3390/cancers15143544]
Abstract
PURPOSE To provide abdominal contrast-enhanced MR image synthesis, we developed a gradient regularized multi-modal multi-discrimination sparse attention fusion generative adversarial network (GRMM-GAN) to avoid repeated contrast injections to patients and facilitate adaptive monitoring. METHODS With IRB approval, 165 abdominal MR studies from 61 liver cancer patients were retrospectively solicited from our institutional database. Each study included T2, T1 pre-contrast (T1pre), and T1 contrast-enhanced (T1ce) images. The GRMM-GAN synthesis pipeline consists of a sparse attention fusion network, an image gradient regularizer (GR), and a generative adversarial network with multi-discrimination. The studies were randomly divided into 115 for training, 20 for validation, and 30 for testing. The two pre-contrast MR modalities, T2 and T1pre images, were adopted as inputs in the training phase, and the T1ce image at the portal venous phase was used as the output. The synthesized T1ce images were compared with the ground-truth T1ce images. The evaluation metrics include peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), and mean squared error (MSE). A Turing test and experts' contours evaluated the image synthesis quality. RESULTS The proposed GRMM-GAN model achieved a PSNR of 28.56, an SSIM of 0.869, and an MSE of 83.27. The proposed model showed statistically significant improvements in all tested metrics (p < 0.05) over the state-of-the-art model comparisons. The average Turing test score was 52.33%, which is close to random guessing, supporting the model's effectiveness for clinical application. In the tumor-specific region analysis, the average tumor contrast-to-noise ratio (CNR) of the synthesized MR images was not statistically different from that of the real MR images. The average DICE between real and synthetic images was 0.90, compared with an inter-operator DICE of 0.91. CONCLUSION We demonstrated a novel multi-modal MR image synthesis neural network, GRMM-GAN, for T1ce MR synthesis based on pre-contrast T1 and T2 MR images. GRMM-GAN shows promise for avoiding repeated contrast injections during radiation therapy treatment.
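The PSNR/SSIM/MSE figures above can, in principle, be reproduced with standard implementations; a minimal sketch using scikit-image, assuming co-registered 2D slices on a common intensity scale:

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_synthesis(synthetic, reference):
    """PSNR / SSIM / MSE between a synthesized slice and its ground truth."""
    synthetic = synthetic.astype(np.float64)
    reference = reference.astype(np.float64)
    data_range = float(reference.max() - reference.min())
    mse = float(np.mean((synthetic - reference) ** 2))
    psnr = peak_signal_noise_ratio(reference, synthetic, data_range=data_range)
    ssim = structural_similarity(reference, synthetic, data_range=data_range)
    return {"MSE": mse, "PSNR": psnr, "SSIM": ssim}
```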
Affiliation(s)
- Changzhe Jiao
- Department of Radiation Oncology, Keck School of Medicine of USC, Los Angeles, CA 90033, USA
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143, USA
- Diane Ling
- Department of Radiation Oncology, Keck School of Medicine of USC, Los Angeles, CA 90033, USA
- Shelly Bian
- Department of Radiation Oncology, Keck School of Medicine of USC, Los Angeles, CA 90033, USA
- April Vassantachart
- Department of Radiation Oncology, Keck School of Medicine of USC, Los Angeles, CA 90033, USA
- Karen Cheng
- Department of Radiation Oncology, Keck School of Medicine of USC, Los Angeles, CA 90033, USA
- Shahil Mehta
- Department of Radiation Oncology, Keck School of Medicine of USC, Los Angeles, CA 90033, USA
- Derrick Lock
- Department of Radiation Oncology, Keck School of Medicine of USC, Los Angeles, CA 90033, USA
- Zhenyu Zhu
- Guangzhou Institute of Technology, Xidian University, Guangzhou 510555, China
- Mary Feng
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143, USA
- Horatio Thomas
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143, USA
- Jessica E. Scholey
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143, USA
- Ke Sheng
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143, USA
- Zhaoyang Fan
- Department of Radiology, Keck School of Medicine of USC, Los Angeles, CA 90033, USA
- Wensha Yang
- Department of Radiation Oncology, Keck School of Medicine of USC, Los Angeles, CA 90033, USA
- Department of Radiation Oncology, UC San Francisco, San Francisco, CA 94143, USA
61
Yang H, Zhou T, Zhou Y, Zhang Y, Fu H. Flexible Fusion Network for Multi-Modal Brain Tumor Segmentation. IEEE J Biomed Health Inform 2023; 27:3349-3359. [PMID: 37126623] [DOI: 10.1109/jbhi.2023.3271808]
Abstract
Automated brain tumor segmentation is crucial for aiding brain disease diagnosis and evaluating disease progress. Currently, magnetic resonance imaging (MRI) is a routinely adopted approach in the field of brain tumor segmentation that can provide images of different modalities. It is critical to leverage multi-modal images to boost brain tumor segmentation performance. Existing works commonly concentrate on generating a shared representation by fusing multi-modal data, while few methods take modality-specific characteristics into account. Besides, how to efficiently fuse arbitrary numbers of modalities remains a difficult task. In this study, we present a flexible fusion network (termed F2Net) for multi-modal brain tumor segmentation, which can flexibly fuse arbitrary numbers of multi-modal inputs to explore complementary information while maintaining the specific characteristics of each modality. Our F2Net is based on an encoder-decoder structure, which utilizes two Transformer-based feature learning streams and a cross-modal shared learning network to extract individual and shared feature representations. To effectively integrate the knowledge from the multi-modality data, we propose a cross-modal feature-enhanced module (CFM) and a multi-modal collaboration module (MCM), which fuse the multi-modal features into the shared learning network and incorporate the features from the encoders into the shared decoder, respectively. Extensive experimental results on multiple benchmark datasets demonstrate the effectiveness of our F2Net over other state-of-the-art segmentation methods.
62
Yang J, Li XX, Liu F, Nie D, Lio P, Qi H, Shen D. Fast Multi-Contrast MRI Acquisition by Optimal Sampling of Information Complementary to Pre-Acquired MRI Contrast. IEEE Trans Med Imaging 2023; 42:1363-1373. [PMID: 37015608] [DOI: 10.1109/tmi.2022.3227262]
Abstract
Recent studies on multi-contrast MRI reconstruction have demonstrated the potential of further accelerating MRI acquisition by exploiting correlation between contrasts. Most of the state-of-the-art approaches have achieved improvement through the development of network architectures for fixed under-sampling patterns, without considering inter-contrast correlation in the under-sampling pattern design. On the other hand, sampling pattern learning methods have shown better reconstruction performance than those with fixed under-sampling patterns. However, most under-sampling pattern learning algorithms are designed for single contrast MRI without exploiting complementary information between contrasts. To this end, we propose a framework to optimize the under-sampling pattern of a target MRI contrast which complements the acquired fully-sampled reference contrast. Specifically, a novel image synthesis network is introduced to extract the redundant information contained in the reference contrast, which is exploited in the subsequent joint pattern optimization and reconstruction network. We have demonstrated superior performance of our learned under-sampling patterns on both public and in-house datasets, compared to the commonly used under-sampling patterns and state-of-the-art methods that jointly optimize the reconstruction network and the under-sampling patterns, up to 8-fold under-sampling factor.
63
Touati R, Kadoury S. A least square generative network based on invariant contrastive feature pair learning for multimodal MR image synthesis. Int J Comput Assist Radiol Surg 2023. [PMID: 37103727] [DOI: 10.1007/s11548-023-02916-z]
Abstract
PURPOSE During MR-guided neurosurgical procedures, several factors may limit the acquisition of additional MR sequences, which neurosurgeons need to adjust surgical plans or ensure complete tumor resection. Automatically synthesized MR contrasts generated from other available heterogeneous MR sequences could alleviate these timing constraints. METHODS We propose a new multimodal MR synthesis approach that leverages a combination of MR modalities presenting glioblastomas to generate an additional modality. The proposed learning approach relies on a least-squares GAN (LSGAN) with an unsupervised contrastive learning strategy. We incorporate a contrastive encoder, which extracts an invariant contrastive representation from augmented pairs of the generated and real target MR contrasts. This contrastive representation describes a pair of features for each input channel, allowing the generator to be regularized to be invariant to high-frequency orientations. Moreover, when training the generator, we add to the LSGAN loss another term reformulated as the combination of a reconstruction loss and a novel perception loss based on a pair of features. RESULTS When compared to other multimodal MR synthesis approaches evaluated on the BraTS'18 brain dataset, the model yields the highest Dice score with [Formula: see text] and achieves the lowest variation of information of [Formula: see text], with a probabilistic Rand index of [Formula: see text] and a global consistency error of [Formula: see text]. CONCLUSION The proposed model generates reliable MR contrasts with enhanced tumors on the synthesized image using a brain tumor dataset (BraTS'18). In future work, we will perform a clinical evaluation of residual tumor segmentations during MR-guided neurosurgeries, where limited MR contrasts will be acquired during the procedure.
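The least-squares GAN objective named above replaces the usual cross-entropy adversarial loss with squared errors against real/fake targets; a generic sketch (with 0/1 targets, which may differ from the authors' coding):

```python
import torch

def lsgan_d_loss(d_real, d_fake):
    """Least-squares discriminator loss: push real scores towards 1, fake towards 0."""
    return 0.5 * torch.mean((d_real - 1.0) ** 2) + 0.5 * torch.mean(d_fake ** 2)

def lsgan_g_loss(d_fake):
    """Least-squares generator loss: push scores of generated images towards 1."""
    return 0.5 * torch.mean((d_fake - 1.0) ** 2)
```

In this framework the reconstruction and perception terms described in the abstract would be added to the generator loss with scalar weights.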
Affiliation(s)
- Redha Touati
- Polytechnique Montréal, Montreal, QC, H3T 1J4, Canada
- Samuel Kadoury
- Polytechnique Montréal, Montreal, QC, H3T 1J4, Canada
- CHUM, Université de Montréal, Montreal, H2X 0A9, Canada
64
Xia Y, Ravikumar N, Lassila T, Frangi AF. Virtual high-resolution MR angiography from non-angiographic multi-contrast MRIs: synthetic vascular model populations for in-silico trials. Med Image Anal 2023; 87:102814. [PMID: 37196537] [DOI: 10.1016/j.media.2023.102814]
Abstract
Despite success on multi-contrast MR image synthesis, generating specific modalities remains challenging. Those include Magnetic Resonance Angiography (MRA) that highlights details of vascular anatomy using specialised imaging sequences for emphasising inflow effect. This work proposes an end-to-end generative adversarial network that can synthesise anatomically plausible, high-resolution 3D MRA images using commonly acquired multi-contrast MR images (e.g. T1/T2/PD-weighted MR images) for the same subject whilst preserving the continuity of vascular anatomy. A reliable technique for MRA synthesis would unleash the research potential of very few population databases with imaging modalities (such as MRA) that enable quantitative characterisation of whole-brain vasculature. Our work is motivated by the need to generate digital twins and virtual patients of cerebrovascular anatomy for in-silico studies and/or in-silico trials. We propose a dedicated generator and discriminator that leverage the shared and complementary features of multi-source images. We design a composite loss function for emphasising vascular properties by minimising the statistical difference between the feature representations of the target images and the synthesised outputs in both 3D volumetric and 2D projection domains. Experimental results show that the proposed method can synthesise high-quality MRA images and outperform the state-of-the-art generative models both qualitatively and quantitatively. The importance assessment reveals that T2 and PD-weighted images are better predictors of MRA images than T1; and PD-weighted images contribute to better visibility of small vessel branches towards the peripheral regions. In addition, the proposed approach can generalise to unseen data acquired at different imaging centres with different scanners, whilst synthesising MRAs and vascular geometries that maintain vessel continuity. The results show the potential for use of the proposed approach to generating digital twin cohorts of cerebrovascular anatomy at scale from structural MR images typically acquired in population imaging initiatives.
Affiliation(s)
- Yan Xia
- Centre for Computational Imaging and Simulation Technologies in Biomedicine (CISTIB), School of Computing, University of Leeds, Leeds, UK
- Nishant Ravikumar
- Centre for Computational Imaging and Simulation Technologies in Biomedicine (CISTIB), School of Computing, University of Leeds, Leeds, UK
- Toni Lassila
- Centre for Computational Imaging and Simulation Technologies in Biomedicine (CISTIB), School of Computing, University of Leeds, Leeds, UK
- Alejandro F Frangi
- Centre for Computational Imaging and Simulation Technologies in Biomedicine (CISTIB), School of Computing, University of Leeds, Leeds, UK; Leeds Institute for Cardiovascular and Metabolic Medicine (LICAMM), School of Medicine, University of Leeds, Leeds, UK; Medical Imaging Research Center (MIRC), Cardiovascular Science and Electronic Engineering Departments, KU Leuven, Leuven, Belgium; Alan Turing Institute, London, UK
65
Meng Z, Zhu Y, Pang W, Tian J, Nie F, Wang K. MSMFN: An Ultrasound Based Multi-Step Modality Fusion Network for Identifying the Histologic Subtypes of Metastatic Cervical Lymphadenopathy. IEEE Trans Med Imaging 2023; 42:996-1008. [PMID: 36383594] [DOI: 10.1109/tmi.2022.3222541]
Abstract
Identifying the squamous cell carcinoma and adenocarcinoma subtypes of metastatic cervical lymphadenopathy (CLA) is critical for localizing the primary lesion and initiating timely therapy. B-mode ultrasound (BUS), color Doppler flow imaging (CDFI), ultrasound elastography (UE), and dynamic contrast-enhanced ultrasound provide effective tools for identification, but synthesizing the information across modalities is a challenge for clinicians. Therefore, rationally fusing these modalities with clinical information via deep learning to personalize the classification of metastatic CLA requires new exploration. In this paper, we propose the Multi-step Modality Fusion Network (MSMFN) for multi-modal ultrasound fusion to identify the histological subtypes of metastatic CLA. MSMFN can mine the unique features of each modality and fuse them in a hierarchical three-step process. Specifically, under the guidance of high-level BUS semantic feature maps, information in CDFI and UE is first extracted by modality interaction to obtain a static imaging feature vector. Then, a self-supervised feature orthogonalization loss is introduced to help learn modality-heterogeneous features while maintaining maximal task-consistent category distinguishability across modalities. Finally, six encoded clinical variables are utilized to avoid prediction bias and further improve prediction ability. Our three-fold cross-validation experiments demonstrate that our method surpasses clinicians and other multi-modal fusion methods with an accuracy of 80.06%, a true-positive rate of 81.81%, and a true-negative rate of 80.00%. Our network provides a multi-modal ultrasound fusion framework that considers prior clinical knowledge and modality-specific characteristics. Our code will be available at: https://github.com/RichardSunnyMeng/MSMFN.
66
Joseph J, Biji I, Babu N, Pournami PN, Jayaraj PB, Puzhakkal N, Sabu C, Patel V. Fan beam CT image synthesis from cone beam CT image using nested residual UNet based conditional generative adversarial network. Phys Eng Sci Med 2023; 46:703-717. [PMID: 36943626] [DOI: 10.1007/s13246-023-01244-5]
Abstract
Image-Guided Radiation Therapy is a radiotherapy technique that adopts frequent imaging throughout a treatment session. Fan Beam Computed Tomography (FBCT) based planning followed by Cone Beam Computed Tomography (CBCT) based radiation delivery has drastically improved treatment accuracy. Further reductions in radiation exposure and cost could be achieved if FBCT could be replaced with CBCT. This paper proposes a Conditional Generative Adversarial Network (CGAN) for CBCT-to-FBCT synthesis. Specifically, a new architecture called Nested Residual UNet (NR-UNet) is introduced as the generator of the CGAN. A composite loss function comprising adversarial loss, Mean Squared Error (MSE), and Gradient Difference Loss (GDL) is used with the generator. The CGAN utilises the inter-slice dependency in the input by taking three consecutive CBCT slices to generate one FBCT slice. The model is trained on Head-and-Neck (H&N) FBCT-CBCT images of 53 cancer patients. The synthetic images exhibited a Peak Signal-to-Noise Ratio of 34.04±0.93 dB, a Structural Similarity Index Measure of 0.9751±0.001, and a Mean Absolute Error of 14.81±4.70 HU. On average, the proposed model guarantees an improvement in Contrast-to-Noise Ratio four times better than the input CBCT images. The model also minimised the MSE and alleviated blurriness. Compared with the CBCT-based plan, the synthetic image results in a treatment plan closer to the FBCT-based plan. The three-slice to single-slice translation captures three-dimensional contextual information in the input while avoiding the computational complexity associated with a fully three-dimensional synthesis model. Furthermore, the results demonstrate that the proposed model is superior to state-of-the-art methods.
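The Gradient Difference Loss (GDL) term of the composite loss penalizes mismatched image gradients; a generic sketch, with the exponent and weighting as assumptions since they are not specified here:

```python
import torch

def gradient_difference_loss(pred, target, alpha=1.0):
    """Penalize differences between the spatial gradients of prediction and target.
    pred, target: (batch, channels, H, W); `alpha` is an assumed exponent."""
    def grads(x):
        return x[..., :, 1:] - x[..., :, :-1], x[..., 1:, :] - x[..., :-1, :]
    pred_dx, pred_dy = grads(pred)
    tgt_dx, tgt_dy = grads(target)
    loss_x = torch.mean(torch.abs(torch.abs(tgt_dx) - torch.abs(pred_dx)) ** alpha)
    loss_y = torch.mean(torch.abs(torch.abs(tgt_dy) - torch.abs(pred_dy)) ** alpha)
    return loss_x + loss_y
```

Such a term is typically added to the adversarial and MSE terms with scalar weights to sharpen edges in the synthesized slices.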
Affiliation(s)
- Jiffy Joseph
- Computer Science and Engineering Department, National Institute of Technology Calicut, Kattangal, Calicut, Kerala, 673601, India
- Ivan Biji
- Computer Science and Engineering Department, National Institute of Technology Calicut, Kattangal, Calicut, Kerala, 673601, India
- Naveen Babu
- Computer Science and Engineering Department, National Institute of Technology Calicut, Kattangal, Calicut, Kerala, 673601, India
- P N Pournami
- Computer Science and Engineering Department, National Institute of Technology Calicut, Kattangal, Calicut, Kerala, 673601, India
- P B Jayaraj
- Computer Science and Engineering Department, National Institute of Technology Calicut, Kattangal, Calicut, Kerala, 673601, India
- Niyas Puzhakkal
- Department of Medical Physics, MVR Cancer Centre & Research Institute, Poolacode, Calicut, Kerala, 673601, India
- Christy Sabu
- Computer Science and Engineering Department, National Institute of Technology Calicut, Kattangal, Calicut, Kerala, 673601, India
- Vedkumar Patel
- Computer Science and Engineering Department, National Institute of Technology Calicut, Kattangal, Calicut, Kerala, 673601, India
67
Li Y, Xu S, Chen H, Sun Y, Bian J, Guo S, Lu Y, Qi Z. CT synthesis from multi-sequence MRI using adaptive fusion network. Comput Biol Med 2023; 157:106738. [PMID: 36924728] [DOI: 10.1016/j.compbiomed.2023.106738]
Abstract
OBJECTIVE To investigate a method using multi-sequence magnetic resonance imaging (MRI) to synthesize computed tomography (CT) for MRI-only radiation therapy. APPROACH We proposed an adaptive multi-sequence fusion network (AMSF-Net) to exploit both voxel- and context-wise cross-sequence correlations from multiple MRI sequences to synthesize CT, using element- and patch-wise fusions, respectively. The element- and patch-wise fusion feature spaces were combined, and the most representative features were selected for modeling. Finally, a densely connected convolutional decoder was applied to the selected features to produce synthetic CT images. MAIN RESULTS This study included T1-weighted MRI, T2-weighted MRI, and CT data from a total of 90 patients. AMSF-Net reduced the average mean absolute error (MAE) from 52.88-57.23 to 49.15 HU, increased the peak signal-to-noise ratio (PSNR) from 24.82-25.32 to 25.63 dB, increased the structural similarity index measure (SSIM) from 0.857-0.869 to 0.878, and increased the Dice coefficient of bone from 0.886-0.896 to 0.903 compared with three existing multi-sequence learning models. The improvements were statistically significant according to a two-tailed paired t-test. In addition, AMSF-Net reduced the intensity difference with real CT in five organs at risk, four types of normal tissue, and tumor compared with the baseline models. The MAE decreases in the parotid and spinal cord exceeded 8% and 16% of the mean intensity value of the corresponding organ, respectively. Further, qualitative evaluations confirmed that AMSF-Net produced superior structural image quality for synthesized bone and small organs such as the eye lens. SIGNIFICANCE The proposed method can improve the intensity and structural image quality of synthetic CT and has potential for use in clinical applications.
Affiliation(s)
- Yan Li
- School of Data and Computer Engineering, Sun Yat-sen University, Guangzhou, PR China
- Sisi Xu
- Department of Radiation Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital & Shenzhen Hospital, Chinese Academy of Medical Sciences, Peking Union Medical College, Shenzhen, PR China
- Ying Sun
- Department of Radiation Oncology, Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center of Cancer Medicine, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangzhou, PR China
- Jing Bian
- School of Data and Computer Engineering, Sun Yat-sen University, Guangzhou, PR China
- Shuanshuan Guo
- The Fifth Affiliated Hospital of Sun Yat-sen University, Cancer Center, Guangzhou, PR China
- Yao Lu
- School of Computer Science and Engineering, Sun Yat-sen University, Guangdong Province Key Laboratory of Computational Science, Guangzhou, PR China
- Zhenyu Qi
- Department of Radiation Oncology, Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center of Cancer Medicine, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangzhou, PR China
68
Suresh K, Cohen MS, Hartnick CJ, Bartholomew RA, Lee DJ, Crowson MG. Generation of synthetic tympanic membrane images: Development, human validation, and clinical implications of synthetic data. PLOS Digit Health 2023; 2:e0000202. [PMID: 36827244] [PMCID: PMC9956024] [DOI: 10.1371/journal.pdig.0000202]
Abstract
Synthetic clinical images could augment real medical image datasets, a novel approach in otolaryngology-head and neck surgery (OHNS). Our objective was to develop a generative adversarial network (GAN) for tympanic membrane images and to validate the quality of the synthetic images with human reviewers. Our model was developed using a state-of-the-art GAN architecture, StyleGAN2-ADA. The network was trained on intraoperative high-definition (HD) endoscopic images of tympanic membranes collected from pediatric patients undergoing myringotomy with possible tympanostomy tube placement. A human validation survey was administered to a cohort of OHNS and pediatrics trainees at our institution. The primary measure of model quality was the Frechet Inception Distance (FID), a metric comparing the distribution of generated images with the distribution of real images. The measures used for human reviewer validation were the sensitivity, specificity, and area under the curve (AUC) for humans' ability to discern synthetic from real images. Our dataset comprised 202 images. The best GAN was trained at 512x512 image resolution with an FID of 47.0. The progression of images through training showed stepwise "learning" of the anatomic features of a tympanic membrane. The validation survey was taken by 65 persons who reviewed 925 images. Human reviewers demonstrated a sensitivity of 66%, specificity of 73%, and AUC of 0.69 for the detection of synthetic images. In summary, we successfully developed a GAN to produce synthetic tympanic membrane images and validated this with human reviewers. These images could be used to bolster real datasets with various pathologies and to develop more robust deep learning models such as those used for diagnostic predictions from otoscopic images. However, caution should be exercised with the use of synthetic data given issues regarding data diversity and performance validation. Any model trained using synthetic data will require robust external validation to ensure validity and generalizability.
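The Frechet Inception Distance used as the primary quality measure compares the Gaussian statistics of real and generated feature embeddings; a minimal sketch, assuming Inception features have already been extracted for both image sets:

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_inception_distance(feats_real, feats_fake):
    """FID between two sets of feature vectors of shape (N, D) each."""
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)
    covmean = sqrtm(cov_r @ cov_f)
    if np.iscomplexobj(covmean):       # numerical noise can introduce tiny imaginary parts
        covmean = covmean.real
    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(cov_r + cov_f - 2.0 * covmean))
```

Lower values indicate that the synthetic distribution is closer to the real one; the value of 47.0 reported above should be read relative to the small (202-image) training set.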
Affiliation(s)
- Krish Suresh
- Department of Otolaryngology-Head & Neck Surgery, Massachusetts Eye & Ear, Boston, Massachusetts, United States of America
- Department of Otolaryngology-Head & Neck Surgery, Harvard Medical School, Boston, Massachusetts, United States of America
- Michael S. Cohen
- Department of Otolaryngology-Head & Neck Surgery, Massachusetts Eye & Ear, Boston, Massachusetts, United States of America
- Department of Otolaryngology-Head & Neck Surgery, Harvard Medical School, Boston, Massachusetts, United States of America
- Christopher J. Hartnick
- Department of Otolaryngology-Head & Neck Surgery, Massachusetts Eye & Ear, Boston, Massachusetts, United States of America
- Department of Otolaryngology-Head & Neck Surgery, Harvard Medical School, Boston, Massachusetts, United States of America
- Ryan A. Bartholomew
- Department of Otolaryngology-Head & Neck Surgery, Massachusetts Eye & Ear, Boston, Massachusetts, United States of America
- Department of Otolaryngology-Head & Neck Surgery, Harvard Medical School, Boston, Massachusetts, United States of America
- Daniel J. Lee
- Department of Otolaryngology-Head & Neck Surgery, Massachusetts Eye & Ear, Boston, Massachusetts, United States of America
- Department of Otolaryngology-Head & Neck Surgery, Harvard Medical School, Boston, Massachusetts, United States of America
- Matthew G. Crowson
- Department of Otolaryngology-Head & Neck Surgery, Massachusetts Eye & Ear, Boston, Massachusetts, United States of America
- Department of Otolaryngology-Head & Neck Surgery, Harvard Medical School, Boston, Massachusetts, United States of America
69
Zhao Y, Wang X, Che T, Bao G, Li S. Multi-task deep learning for medical image computing and analysis: A review. Comput Biol Med 2023; 153:106496. [PMID: 36634599] [DOI: 10.1016/j.compbiomed.2022.106496]
Abstract
The renaissance of deep learning has provided promising solutions to various tasks. While conventional deep learning models are constructed for a single specific task, multi-task deep learning (MTDL), which can accomplish at least two tasks simultaneously, has attracted research attention. MTDL is a joint learning paradigm that harnesses the inherent correlation of multiple related tasks to achieve reciprocal benefits in improving performance, enhancing generalizability, and reducing the overall computational cost. This review focuses on advanced applications of MTDL for medical image computing and analysis. We first summarize four popular MTDL network architectures (cascaded, parallel, interacted, and hybrid). Then, we review representative MTDL-based networks for eight application areas, including the brain, eye, chest, cardiac, abdomen, musculoskeletal, pathology, and other human body regions. While MTDL-based medical image processing has been flourishing and demonstrating outstanding performance in many tasks, performance gaps remain in some tasks, and accordingly we discuss the open challenges and prospective trends. For instance, in the 2018 Ischemic Stroke Lesion Segmentation challenge, the reported top Dice score of 0.51 and top recall of 0.55 achieved by a cascaded MTDL model indicate that further research efforts are in high demand to escalate the performance of current models.
Collapse
Affiliation(s)
- Yan Zhao
- Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, Beijing, 100083, China
| | - Xiuying Wang
- School of Computer Science, The University of Sydney, Sydney, NSW, 2008, Australia.
| | - Tongtong Che
- Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, Beijing, 100083, China
| | - Guoqing Bao
- School of Computer Science, The University of Sydney, Sydney, NSW, 2008, Australia
| | - Shuyu Li
- State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, 100875, China.
| |
Collapse
|
70
|
Chen C, Raymond C, Speier W, Jin X, Cloughesy TF, Enzmann D, Ellingson BM, Arnold CW. Synthesizing MR Image Contrast Enhancement Using 3D High-Resolution ConvNets. IEEE Trans Biomed Eng 2023; 70:401-412. [PMID: 35853075 PMCID: PMC9928432 DOI: 10.1109/tbme.2022.3192309] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
OBJECTIVE Gadolinium-based contrast agents (GBCAs) have been widely used to better visualize disease in brain magnetic resonance imaging (MRI). However, gadolinium deposition within the brain and body has raised safety concerns about the use of GBCAs. Therefore, the development of novel approaches that can decrease or even eliminate GBCA exposure while providing similar contrast information would be of significant use clinically. METHODS In this work, we present a deep learning based approach for contrast-enhanced T1 synthesis on brain tumor patients. A 3D high-resolution fully convolutional network (FCN), which maintains high resolution information through processing and aggregates multi-scale information in parallel, is designed to map pre-contrast MRI sequences to contrast-enhanced MRI sequences. Specifically, three pre-contrast MRI sequences, T1, T2 and apparent diffusion coefficient map (ADC), are utilized as inputs and the post-contrast T1 sequences are utilized as target output. To alleviate the data imbalance problem between normal tissues and the tumor regions, we introduce a local loss to improve the contribution of the tumor regions, which leads to better enhancement results on tumors. RESULTS Extensive quantitative and visual assessments are performed, with our proposed model achieving a PSNR of 28.24 dB in the brain and 21.2 dB in tumor regions. CONCLUSION AND SIGNIFICANCE Our results suggest the potential of substituting GBCAs with synthetic contrast images generated via deep learning.
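The "local loss" described above, which up-weights the tumor region during training, can be sketched as follows (a hedged PyTorch illustration; the exact weighting scheme and loss form used by the authors are not specified here, so the factor w_tumor is hypothetical):

```python
import torch

def weighted_synthesis_loss(pred, target, tumor_mask, w_tumor: float = 5.0):
    """Global MSE plus an extra, tumor-restricted MSE term so that enhancement
    errors inside the lesion contribute more strongly to the gradient."""
    global_term = torch.mean((pred - target) ** 2)
    masked_sq_err = tumor_mask * (pred - target) ** 2
    local_term = masked_sq_err.sum() / tumor_mask.sum().clamp(min=1)
    return global_term + w_tumor * local_term
```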
Collapse
|
71
|
Osuala R, Kushibar K, Garrucho L, Linardos A, Szafranowska Z, Klein S, Glocker B, Diaz O, Lekadir K. Data synthesis and adversarial networks: A review and meta-analysis in cancer imaging. Med Image Anal 2023; 84:102704. [PMID: 36473414 DOI: 10.1016/j.media.2022.102704] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2021] [Revised: 11/02/2022] [Accepted: 11/21/2022] [Indexed: 11/26/2022]
Abstract
Despite technological and medical advances, the detection, interpretation, and treatment of cancer based on imaging data continue to pose significant challenges. These include inter-observer variability, class imbalance, dataset shifts, inter- and intra-tumour heterogeneity, malignancy determination, and treatment effect uncertainty. Given the recent advancements in image synthesis, Generative Adversarial Networks (GANs), and adversarial training, we assess the potential of these technologies to address a number of key challenges of cancer imaging. We categorise these challenges into (a) data scarcity and imbalance, (b) data access and privacy, (c) data annotation and segmentation, (d) cancer detection and diagnosis, and (e) tumour profiling, treatment planning and monitoring. Based on our analysis of 164 publications that apply adversarial training techniques in the context of cancer imaging, we highlight multiple underexplored solutions with research potential. We further contribute the Synthesis Study Trustworthiness Test (SynTRUST), a meta-analysis framework for assessing the validation rigour of medical image synthesis studies. SynTRUST is based on 26 concrete measures of thoroughness, reproducibility, usefulness, scalability, and tenability. Based on SynTRUST, we analyse 16 of the most promising cancer imaging challenge solutions and observe a high validation rigour in general, but also several desirable improvements. With this work, we strive to bridge the gap between the needs of the clinical cancer imaging community and the current and prospective research on data synthesis and adversarial networks in the artificial intelligence community.
Collapse
Affiliation(s)
- Richard Osuala
- Artificial Intelligence in Medicine Lab (BCN-AIM), Facultat de Matemàtiques i Informàtica, Universitat de Barcelona, Spain.
| | - Kaisar Kushibar
- Artificial Intelligence in Medicine Lab (BCN-AIM), Facultat de Matemàtiques i Informàtica, Universitat de Barcelona, Spain
| | - Lidia Garrucho
- Artificial Intelligence in Medicine Lab (BCN-AIM), Facultat de Matemàtiques i Informàtica, Universitat de Barcelona, Spain
| | - Akis Linardos
- Artificial Intelligence in Medicine Lab (BCN-AIM), Facultat de Matemàtiques i Informàtica, Universitat de Barcelona, Spain
| | - Zuzanna Szafranowska
- Artificial Intelligence in Medicine Lab (BCN-AIM), Facultat de Matemàtiques i Informàtica, Universitat de Barcelona, Spain
| | - Stefan Klein
- Biomedical Imaging Group Rotterdam, Department of Radiology & Nuclear Medicine, Erasmus MC, Rotterdam, The Netherlands
| | - Ben Glocker
- Biomedical Image Analysis Group, Department of Computing, Imperial College London, UK
| | - Oliver Diaz
- Artificial Intelligence in Medicine Lab (BCN-AIM), Facultat de Matemàtiques i Informàtica, Universitat de Barcelona, Spain
| | - Karim Lekadir
- Artificial Intelligence in Medicine Lab (BCN-AIM), Facultat de Matemàtiques i Informàtica, Universitat de Barcelona, Spain
| |
Collapse
|
72
|
Cai H, Gao Y, Liu M. Graph Transformer Geometric Learning of Brain Networks Using Multimodal MR Images for Brain Age Estimation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:456-466. [PMID: 36374874 DOI: 10.1109/tmi.2022.3222093] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Brain age is considered an important biomarker for detecting aging-related diseases such as Alzheimer's Disease (AD). Magnetic resonance imaging (MRI) has been widely investigated with deep neural networks for brain age estimation. However, most existing methods cannot make full use of multimodal MRIs due to the difference in data structure. In this paper, we propose a graph transformer geometric learning framework to model the multimodal brain network constructed by structural MRI (sMRI) and diffusion tensor imaging (DTI) for brain age estimation. First, we build a two-stream convolutional autoencoder to learn the latent representations for each imaging modality. The brain template with prior knowledge is utilized to calculate the features from the regions of interest (ROIs). Then, a multi-level construction of the brain network is proposed to establish the hybrid ROI connections in space, feature, and modality. Next, a graph transformer network is proposed to model the cross-modal interaction and fusion by geometric learning for brain age estimation. Finally, the difference between the estimated age and the chronological age is used as an important biomarker for AD diagnosis. Our method is evaluated with the sMRI and DTI data from the UK Biobank and the Alzheimer's Disease Neuroimaging Initiative database. Experimental results demonstrate that our method achieves promising performance for brain age estimation and AD diagnosis.
Collapse
|
73
|
Yurt M, Dalmaz O, Dar S, Ozbey M, Tinaz B, Oguz K, Cukur T. Semi-Supervised Learning of MRI Synthesis Without Fully-Sampled Ground Truths. IEEE TRANSACTIONS ON MEDICAL IMAGING 2022; 41:3895-3906. [PMID: 35969576 DOI: 10.1109/tmi.2022.3199155] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Learning-based translation between MRI contrasts involves supervised deep models trained using high-quality source- and target-contrast images derived from fully-sampled acquisitions, which might be difficult to collect under limitations on scan costs or time. To facilitate curation of training sets, here we introduce the first semi-supervised model for MRI contrast translation (ssGAN) that can be trained directly using undersampled k-space data. To enable semi-supervised learning on undersampled data, ssGAN introduces novel multi-coil losses in image, k-space, and adversarial domains. Unlike traditional losses in single-coil synthesis models, the multi-coil losses are selectively enforced on acquired k-space samples. Comprehensive experiments on retrospectively undersampled multi-contrast brain MRI datasets are provided. Our results demonstrate that ssGAN yields on-par performance with a supervised model, while outperforming single-coil models trained on coil-combined magnitude images. It also outperforms cascaded reconstruction-synthesis models where a supervised synthesis model is trained following self-supervised reconstruction of undersampled data. Thus, ssGAN holds great promise to improve the feasibility of learning-based multi-contrast MRI synthesis.
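A minimal sketch of the idea of enforcing a loss only on acquired k-space samples (single-coil for brevity, whereas ssGAN uses multi-coil losses; written in PyTorch with an assumed binary sampling mask):

```python
import torch

def masked_kspace_l1(pred_img, target_kspace, mask):
    """L1 loss restricted to acquired k-space locations.
    pred_img:      (B, H, W) real-valued synthetic target-contrast image
    target_kspace: (B, H, W) complex undersampled k-space of the target contrast
    mask:          (B, H, W) binary 0/1 sampling mask (1 = acquired)"""
    pred_kspace = torch.fft.fft2(pred_img.to(torch.complex64))
    diff = (pred_kspace - target_kspace) * mask
    return diff.abs().sum() / mask.sum().clamp(min=1)
```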
Collapse
|
74
|
Kim E, Cho HH, Kwon J, Oh YT, Ko ES, Park H. Tumor-Attentive Segmentation-Guided GAN for Synthesizing Breast Contrast-Enhanced MRI Without Contrast Agents. IEEE JOURNAL OF TRANSLATIONAL ENGINEERING IN HEALTH AND MEDICINE 2022; 11:32-43. [PMID: 36478773 PMCID: PMC9721354 DOI: 10.1109/jtehm.2022.3221918] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 07/27/2022] [Revised: 10/25/2022] [Accepted: 11/10/2022] [Indexed: 11/16/2022]
Abstract
OBJECTIVE Breast dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) is a sensitive imaging technique critical for breast cancer diagnosis. However, the administration of contrast agents poses a potential risk. This can be avoided if contrast-enhanced MRI can be obtained without using contrast agents. Thus, we aimed to generate T1-weighted contrast-enhanced MRI (ceT1) images from pre-contrast T1 weighted MRI (preT1) images in the breast. METHODS We proposed a generative adversarial network to synthesize ceT1 from preT1 breast images that adopted a local discriminator and segmentation task network to focus specifically on the tumor region in addition to the whole breast. The segmentation network performed a related task of segmentation of the tumor region, which allowed important tumor-related information to be enhanced. In addition, edge maps were included to provide explicit shape and structural information. Our approach was evaluated and compared with other methods in the local (n = 306) and external validation (n = 140) cohorts. Four evaluation metrics of normalized mean squared error (NRMSE), Pearson cross-correlation coefficients (CC), peak signal-to-noise ratio (PSNR), and structural similarity index map (SSIM) for the whole breast and tumor region were measured. An ablation study was performed to evaluate the incremental benefits of various components in our approach. RESULTS Our approach performed the best with an NRMSE 25.65, PSNR 54.80 dB, SSIM 0.91, and CC 0.88 on average, in the local test set. CONCLUSION Performance gains were replicated in the validation cohort. SIGNIFICANCE We hope that our method will help patients avoid potentially harmful contrast agents. Clinical and Translational Impact Statement-Contrast agents are necessary to obtain DCE-MRI which is essential in breast cancer diagnosis. However, administration of contrast agents may cause side effects such as nephrogenic systemic fibrosis and risk of toxic residue deposits. Our approach can generate DCE-MRI without contrast agents using a generative deep neural network. Thus, our approach could help patients avoid potentially harmful contrast agents resulting in an improved diagnosis and treatment workflow for breast cancer.
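The four reported metrics can be reproduced with standard tooling; a sketch using scikit-image and NumPy is given below (here skimage's normalized_root_mse stands in for NRMSE, and tumor-region scores would be obtained by first cropping both arrays to the tumor bounding box):

```python
import numpy as np
from skimage.metrics import (normalized_root_mse,
                             peak_signal_noise_ratio,
                             structural_similarity)

def synthesis_metrics(real: np.ndarray, synth: np.ndarray) -> dict:
    """Quality metrics between a real ceT1 slice and its synthetic counterpart."""
    data_range = float(real.max() - real.min())
    return {
        "NRMSE": normalized_root_mse(real, synth),
        "PSNR": peak_signal_noise_ratio(real, synth, data_range=data_range),
        "SSIM": structural_similarity(real, synth, data_range=data_range),
        "CC": float(np.corrcoef(real.ravel(), synth.ravel())[0, 1]),
    }
```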
Collapse
Affiliation(s)
- Eunjin Kim
- Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon 16419, South Korea
| | - Hwan-Ho Cho
- Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon 16419, South Korea
- Department of Medical Artificial Intelligence, Konyang University, Daejeon 35365, South Korea
| | - Junmo Kwon
- Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon 16419, South Korea
| | - Young-Tack Oh
- Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon 16419, South Korea
| | - Eun Sook Ko
- Samsung Medical Center, Department of Radiology, School of Medicine, Sungkyunkwan University, Seoul 06351, South Korea
| | - Hyunjin Park
- School of Electronic and Electrical Engineering, Sungkyunkwan University, Suwon 16419, South Korea
- Center for Neuroscience Imaging Research, Institute for Basic Science, Suwon 16419, South Korea
| |
Collapse
|
75
|
Yang Q, Guo X, Chen Z, Woo PYM, Yuan Y. D2-Net: Dual Disentanglement Network for Brain Tumor Segmentation With Missing Modalities. IEEE TRANSACTIONS ON MEDICAL IMAGING 2022; 41:2953-2964. [PMID: 35576425 DOI: 10.1109/tmi.2022.3175478] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Multi-modal Magnetic Resonance Imaging (MRI) can provide complementary information for automatic brain tumor segmentation, which is crucial for diagnosis and prognosis. However, missing modality data are common in clinical practice and can cause most previous methods, which rely on complete modality data, to fail. Current state-of-the-art approaches cope with missing modalities by fusing multi-modal images and features to learn shared representations of tumor regions, but often fail to explicitly capture the correlations among modalities and tumor regions. Inspired by the fact that modality information plays distinct roles in segmenting different tumor regions, we aim to explicitly exploit the correlations among various modality-specific information and tumor-specific knowledge for segmentation. To this end, we propose a Dual Disentanglement Network (D2-Net) for brain tumor segmentation with missing modalities, which consists of a modality disentanglement stage (MD-Stage) and a tumor-region disentanglement stage (TD-Stage). In the MD-Stage, a spatial-frequency joint modality contrastive learning scheme is designed to directly decouple the modality-specific information from MRI data. To decompose tumor-specific representations and extract discriminative holistic features, we propose an affinity-guided dense tumor-region knowledge distillation mechanism in the TD-Stage by aligning the features of a disentangled binary teacher network with a holistic student network. By explicitly discovering relations among modalities and tumor regions, our model can learn sufficient information for segmentation even if some modalities are missing. Extensive experiments on the public BraTS-2018 database demonstrate the superiority of our framework over state-of-the-art methods in missing-modality situations. Codes are available at https://github.com/CityU-AIM-Group/D2Net.
Collapse
|
76
|
Dalmaz O, Yurt M, Cukur T. ResViT: Residual Vision Transformers for Multimodal Medical Image Synthesis. IEEE TRANSACTIONS ON MEDICAL IMAGING 2022; 41:2598-2614. [PMID: 35436184 DOI: 10.1109/tmi.2022.3167808] [Citation(s) in RCA: 122] [Impact Index Per Article: 40.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Generative adversarial models with convolutional neural network (CNN) backbones have recently been established as state-of-the-art in numerous medical image synthesis tasks. However, CNNs are designed to perform local processing with compact filters, and this inductive bias compromises learning of contextual features. Here, we propose a novel generative adversarial approach for medical image synthesis, ResViT, that leverages the contextual sensitivity of vision transformers along with the precision of convolution operators and realism of adversarial learning. ResViT's generator employs a central bottleneck comprising novel aggregated residual transformer (ART) blocks that synergistically combine residual convolutional and transformer modules. Residual connections in ART blocks promote diversity in captured representations, while a channel compression module distills task-relevant information. A weight sharing strategy is introduced among ART blocks to mitigate computational burden. A unified implementation is introduced to avoid the need to rebuild separate synthesis models for varying source-target modality configurations. Comprehensive demonstrations are performed for synthesizing missing sequences in multi-contrast MRI, and CT images from MRI. Our results indicate superiority of ResViT against competing CNN- and transformer-based methods in terms of qualitative observations and quantitative metrics.
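The idea of an aggregated residual transformer block, pairing a convolutional branch with a transformer branch inside a residual connection plus channel compression, can be sketched roughly as follows (PyTorch; this is a simplified stand-in rather than ResViT's exact ART block, and the channel and head counts are illustrative):

```python
import torch
import torch.nn as nn

class HybridResidualBlock(nn.Module):
    """Residual block mixing a local convolutional branch with a global transformer branch."""
    def __init__(self, channels: int = 64, num_heads: int = 4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.InstanceNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.transformer = nn.TransformerEncoderLayer(
            d_model=channels, nhead=num_heads, batch_first=True
        )
        self.compress = nn.Conv2d(2 * channels, channels, kernel_size=1)  # channel compression

    def forward(self, x):
        b, c, h, w = x.shape
        local = self.conv(x)                                   # precise local features
        tokens = x.flatten(2).transpose(1, 2)                  # (B, H*W, C)
        ctx = self.transformer(tokens)                         # contextual features
        ctx = ctx.transpose(1, 2).reshape(b, c, h, w)
        return x + self.compress(torch.cat([local, ctx], dim=1))  # residual aggregation
```

Such a block would typically sit at the generator bottleneck, where the H*W token count is small enough for self-attention to be affordable.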
Collapse
|
77
|
Valencia L, Clèrigues A, Valverde S, Salem M, Oliver A, Rovira À, Lladó X. Evaluating the use of synthetic T1-w images in new T2 lesion detection in multiple sclerosis. Front Neurosci 2022; 16:954662. [PMID: 36248650 PMCID: PMC9558286 DOI: 10.3389/fnins.2022.954662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2022] [Accepted: 08/30/2022] [Indexed: 11/23/2022] Open
Abstract
The assessment of disease activity using serial brain MRI scans is one of the most valuable strategies for monitoring treatment response in patients with multiple sclerosis (MS) receiving disease-modifying treatments. Recently, several deep learning approaches have been proposed to improve this analysis, obtaining a good trade-off between sensitivity and specificity, especially when using T1-w and T2-FLAIR images as inputs. However, the need to acquire two different types of images is time-consuming, costly and not always available in clinical practice. In this paper, we investigate an approach to generate synthetic T1-w images from T2-FLAIR images and subsequently analyse the impact of using original and synthetic T1-w images on the performance of a state-of-the-art approach for longitudinal MS lesion detection. We evaluate our approach on a dataset containing 136 images from MS patients, and 73 images with lesion activity (the appearance of new T2 lesions in follow-up scans). To evaluate the synthesis of the images, we analyse the structural similarity index metric and the median absolute error and obtain consistent results. To study the impact of synthetic T1-w images, we evaluate the performance of the new lesion detection approach when using (1) both T2-FLAIR and T1-w original images, (2) only T2-FLAIR images, and (3) both T2-FLAIR and synthetic T1-w images. Sensitivities of 0.75, 0.63, and 0.81, respectively, were obtained at the same false-positive rate (0.14) for all experiments. In addition, we also present the results obtained when using the data from the international MSSEG-2 challenge, showing also an improvement when including synthetic T1-w images. In conclusion, we show that the use of synthetic images can support the lack of data or even be used instead of the original image to homogenize the contrast of the different acquisitions in new T2 lesions detection algorithms.
Collapse
Affiliation(s)
- Liliana Valencia
- Research Institute of Computer Vision and Robotics, University of Girona, Girona, Spain
| | - Albert Clèrigues
- Research Institute of Computer Vision and Robotics, University of Girona, Girona, Spain
| | | | - Mostafa Salem
- Research Institute of Computer Vision and Robotics, University of Girona, Girona, Spain
- Department of Computer Science, Faculty of Computers and Information, Assiut University, Asyut, Egypt
| | - Arnau Oliver
- Research Institute of Computer Vision and Robotics, University of Girona, Girona, Spain
| | - Àlex Rovira
- Magnetic Resonance Unit, Department of Radiology, Vall d'Hebron University Hospital, Barcelona, Spain
| | - Xavier Lladó
- Research Institute of Computer Vision and Robotics, University of Girona, Girona, Spain
| |
Collapse
|
78
|
He S, Feng Y, Grant PE, Ou Y. Deep Relation Learning for Regression and Its Application to Brain Age Estimation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2022; 41:2304-2317. [PMID: 35320092 PMCID: PMC9782832 DOI: 10.1109/tmi.2022.3161739] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Most deep learning models for temporal regression directly output estimates from single input images, ignoring the relationships between different images. In this paper, we propose deep relation learning for regression, aiming to learn different relations between a pair of input images. Four non-linear relations are considered: "cumulative relation," "relative relation," "maximal relation" and "minimal relation." These four relations are learned simultaneously from one deep neural network that has two parts: feature extraction and relation regression. We use an efficient convolutional neural network to extract deep features from the pair of input images and apply a Transformer for relation learning. The proposed method is evaluated on a merged dataset of 6,049 subjects aged 0-97 years using 5-fold cross-validation for the task of brain age estimation. The experimental results show that the proposed method achieved a mean absolute error (MAE) of 2.38 years, which is lower than the MAEs of 8 other state-of-the-art algorithms with statistical significance (p < 0.05) in two-sided paired t-tests.
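For concreteness, the four pairwise relations can be written down directly when the targets are scalar ages; in the sketch below the cumulative and relative relations are assumed, for illustration, to be the sum and signed difference of the pair:

```python
def pairwise_relations(y1: float, y2: float) -> dict:
    """Four relations jointly regressed for an input image pair (scalar targets)."""
    return {
        "cumulative": y1 + y2,       # assumed: sum of the two ages
        "relative":   y1 - y2,       # assumed: signed difference
        "maximal":    max(y1, y2),
        "minimal":    min(y1, y2),
    }

print(pairwise_relations(34.0, 71.5))
# {'cumulative': 105.5, 'relative': -37.5, 'maximal': 71.5, 'minimal': 34.0}
```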
Collapse
|
79
|
Zhou Q, Zou H. A layer-wise fusion network incorporating self-supervised learning for multimodal MR image synthesis. Front Genet 2022; 13:937042. [PMID: 36017492 PMCID: PMC9396279 DOI: 10.3389/fgene.2022.937042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2022] [Accepted: 07/04/2022] [Indexed: 11/13/2022] Open
Abstract
Magnetic resonance (MR) imaging plays an important role in medical diagnosis and treatment; different modalities of MR images can provide rich and complementary information to improve the accuracy of diagnosis. However, due to the limitations of scanning time and medical conditions, certain modalities of MR may be unavailable or of low quality in clinical practice. In this study, we propose a new multimodal MR image synthesis network to generate missing MR images. The proposed model comprises three stages: feature extraction, feature fusion, and image generation. During feature extraction, 2D and 3D self-supervised pretext tasks are introduced to pre-train the backbone for better representations of each modality. Then, a channel attention mechanism is used when fusing features so that the network can adaptively weigh different fusion operations to learn common representations of all modalities. Finally, a generative adversarial network is adopted as the basic framework to generate images, in which a feature-level edge information loss is combined with the pixel-wise loss to ensure consistency between the synthesized and real images in terms of anatomical characteristics. The 2D and 3D self-supervised pre-training improves feature extraction and retains more details in the synthesized images. Moreover, the proposed multimodal attention feature fusion block (MAFFB) in the well-designed layer-wise fusion strategy can model both common and unique information in all modalities, consistent with the clinical analysis. We also perform an interpretability analysis to confirm the rationality and effectiveness of our method. The experimental results demonstrate that our method can be applied in both single-modal and multimodal synthesis with high robustness and outperforms other state-of-the-art approaches objectively and subjectively.
Collapse
|
80
|
Yan S, Wang C, Chen W, Lyu J. Swin transformer-based GAN for multi-modal medical image translation. Front Oncol 2022; 12:942511. [PMID: 36003791 PMCID: PMC9395186 DOI: 10.3389/fonc.2022.942511] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2022] [Accepted: 07/11/2022] [Indexed: 11/13/2022] Open
Abstract
Medical image-to-image translation is considered a new direction with many potential applications in the medical field. The field is dominated by two models: the supervised Pix2Pix and the unsupervised cycle-consistency generative adversarial network (GAN). However, existing methods still have two shortcomings: 1) Pix2Pix requires paired and pixel-aligned images, which are difficult to acquire, while the optimal output of the cycle-consistency model may not be unique; 2) they are still deficient in capturing global features and modeling long-distance interactions, which are critical for regions with complex anatomical structures. We propose a Swin Transformer-based GAN for Multi-Modal Medical Image Translation, named MMTrans. Specifically, MMTrans consists of a generator, a registration network, and a discriminator. The Swin Transformer-based generator can generate images with the same content as the source-modality images and style information similar to the target-modality images. The encoder part of the registration network, based on the Swin Transformer, is utilized to predict deformable vector fields. The convolution-based discriminator determines whether the target-modality images come from the generator or are real images. Extensive experiments conducted on a public dataset and clinical datasets showed that our network outperformed other advanced medical image translation methods on both aligned and unpaired datasets and has great potential for clinical applications.
Collapse
Affiliation(s)
- Shouang Yan
- School of Computer and Control Engineering, Yantai University, Yantai, China
| | - Chengyan Wang
- Human Phenome Institute, Fudan University, Shanghai, China
| | | | - Jun Lyu
- School of Computer and Control Engineering, Yantai University, Yantai, China
| |
Collapse
|
81
|
Zhang Y, Zhou T, Wu W, Xie H, Zhu H, Zhou G, Cichocki A. Improving EEG Decoding via Clustering-Based Multitask Feature Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:3587-3597. [PMID: 33556021 DOI: 10.1109/tnnls.2021.3053576] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Accurate electroencephalogram (EEG) pattern decoding for specific mental tasks is one of the key steps for the development of brain-computer interface (BCI), which is quite challenging due to the considerably low signal-to-noise ratio of EEG collected at the brain scalp. Machine learning provides a promising technique to optimize EEG patterns toward better decoding accuracy. However, existing algorithms do not effectively explore the underlying data structure capturing the true EEG sample distribution and, hence, can only yield a suboptimal decoding accuracy. To uncover the intrinsic distribution structure of EEG data, we propose a clustering-based multitask feature learning algorithm for improved EEG pattern decoding. Specifically, we perform affinity propagation-based clustering to explore the subclasses (i.e., clusters) in each of the original classes and then assign each subclass a unique label based on a one-versus-all encoding strategy. With the encoded label matrix, we devise a novel multitask learning algorithm by exploiting the subclass relationship to jointly optimize the EEG pattern features from the uncovered subclasses. We then train a linear support vector machine with the optimized features for EEG pattern decoding. Extensive experimental studies are conducted on three EEG data sets to validate the effectiveness of our algorithm in comparison with other state-of-the-art approaches. The improved experimental results demonstrate the outstanding superiority of our algorithm, suggesting its prominent performance for EEG pattern decoding in BCI applications.
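The clustering-and-relabelling step described above, where affinity propagation splits each original class into subclasses that each receive a unique label, can be sketched with scikit-learn (the subsequent multitask feature optimization is omitted, and the final decoder is a plain linear SVM for illustration):

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

def subclass_labels(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Assign every affinity-propagation cluster within each class its own label."""
    new_y = np.zeros_like(y)
    offset = 0
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        clusters = AffinityPropagation(random_state=0).fit_predict(X[idx])
        new_y[idx] = clusters + offset
        offset += clusters.max() + 1
    return new_y

# After the features are optimized against the subclass labels, the original
# two-class decoding uses a linear SVM, e.g. sklearn.svm.LinearSVC().fit(X_opt, y).
```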
Collapse
|
82
|
Gao J, Zhao W, Li P, Huang W, Chen Z. LEGAN: A Light and Effective Generative Adversarial Network for medical image synthesis. Comput Biol Med 2022; 148:105878. [PMID: 35863249 DOI: 10.1016/j.compbiomed.2022.105878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2022] [Revised: 06/21/2022] [Accepted: 07/09/2022] [Indexed: 11/28/2022]
Abstract
Medical image synthesis plays an important role in clinical diagnosis by providing auxiliary pathological information. However, previous methods usually adopt a one-step strategy designed for natural image synthesis, which is not sensitive to the local details of tissues within medical images. In addition, these methods consume a great amount of computing resources when generating medical images, which seriously limits their applicability in clinical diagnosis. To address the above issues, a Light and Effective Generative Adversarial Network (LEGAN) is proposed to generate high-fidelity medical images in a lightweight manner. In particular, a coarse-to-fine paradigm is designed to imitate the painting process of humans for medical image synthesis within a two-stage generative adversarial network, which guarantees the sensitivity to local information of medical images. Furthermore, a low-rank convolutional layer is introduced to construct LEGAN for lightweight medical image synthesis, which utilizes principal components of full-rank convolutional kernels to reduce model redundancy. Additionally, a multi-stage mutual information distillation is devised to maximize dependencies of distributions between generated and real medical images in model training. Finally, extensive experiments are conducted in two typical tasks, i.e., retinal fundus image synthesis and proton density-weighted MR image synthesis. The results demonstrate that LEGAN outperforms the comparison methods by a significant margin in terms of Fréchet inception distance (FID) and number of parameters (NoP).
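A common way to realize a low-rank convolutional layer of the kind described, keeping only the principal components of a full-rank kernel, is to factorize a k x k convolution into a narrow k x k convolution followed by a 1 x 1 convolution; the sketch below (PyTorch, with an illustrative rank) is one such factorization, not necessarily LEGAN's exact layer:

```python
import torch.nn as nn

def low_rank_conv(in_ch: int, out_ch: int, k: int = 3, rank: int = 8) -> nn.Sequential:
    """Rank-constrained stand-in for a full k x k convolution."""
    return nn.Sequential(
        nn.Conv2d(in_ch, rank, kernel_size=k, padding=k // 2, bias=False),  # principal spatial filters
        nn.Conv2d(rank, out_ch, kernel_size=1, bias=True),                  # recombine into output channels
    )

# Parameters: in_ch*rank*k*k + rank*out_ch, versus in_ch*out_ch*k*k for the full layer.
```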
Collapse
Affiliation(s)
- Jing Gao
- School of Software Technology, Dalian University of Technology, Economic and Technological Development Zone Tuqiang Street No. 321, Dalian, 116620, Liaoning, China; Key Laboratory for Ubiquitous Network and Service Software of Liaoning, Economic and Technological Development Zone Tuqiang Street No. 321, Dalian, 116620, Liaoning, China
| | - Wenhan Zhao
- School of Software Technology, Dalian University of Technology, Economic and Technological Development Zone Tuqiang Street No. 321, Dalian, 116620, Liaoning, China
| | - Peng Li
- School of Software Technology, Dalian University of Technology, Economic and Technological Development Zone Tuqiang Street No. 321, Dalian, 116620, Liaoning, China.
| | - Wei Huang
- Department of Scientific Research, First Affiliated Hospital of Dalian Medical University, Zhongshan Road No. 222, Dalian, 116012, Liaoning, China.
| | - Zhikui Chen
- School of Software Technology, Dalian University of Technology, Economic and Technological Development Zone Tuqiang Street No. 321, Dalian, 116620, Liaoning, China; Key Laboratory for Ubiquitous Network and Service Software of Liaoning, Economic and Technological Development Zone Tuqiang Street No. 321, Dalian, 116620, Liaoning, China
| |
Collapse
|
83
|
FDG-PET to T1 Weighted MRI Translation with 3D Elicit Generative Adversarial Network (E-GAN). SENSORS 2022; 22:s22124640. [PMID: 35746422 PMCID: PMC9227640 DOI: 10.3390/s22124640] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/28/2022] [Revised: 05/30/2022] [Accepted: 06/07/2022] [Indexed: 02/01/2023]
Abstract
Objective: With the strengths of deep learning, computer-aided diagnosis (CAD) is a hot topic for researchers in medical image analysis. One of the main requirements for training a deep learning model is providing enough data for the network. However, in medical images, due to the difficulties of data collection and data privacy, finding an appropriate dataset (balanced, enough samples, etc.) is quite a challenge. Although image synthesis could be beneficial to overcome this issue, synthesizing 3D images is a hard task. The main objective of this paper is to generate 3D T1-weighted MRI corresponding to FDG-PET. In this study, we propose a separable convolution-based Elicit generative adversarial network (E-GAN). The proposed architecture can reconstruct 3D T1-weighted MRI from 2D high-level features and geometrical information retrieved from a Sobel filter. Experimental results on the ADNI datasets for healthy subjects show that the proposed model improves the quality of images compared with the state of the art. In addition, comparison of E-GAN with state-of-the-art methods shows better results on structural information (13.73% improvement in PSNR and 22.95% in SSIM compared with Pix2Pix GAN) and textural information (6.9% improvement in the homogeneity error of the Haralick features compared with Pix2Pix GAN).
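The geometrical guidance retrieved from a Sobel filter can be computed with off-the-shelf tools; a small NumPy/SciPy sketch is given below (the volume here is a random stand-in for an FDG-PET image):

```python
import numpy as np
from scipy import ndimage

def sobel_edges(volume: np.ndarray) -> np.ndarray:
    """Gradient-magnitude edge map used as auxiliary geometrical information."""
    grads = [ndimage.sobel(volume, axis=a, mode="reflect") for a in range(volume.ndim)]
    return np.sqrt(np.sum(np.square(grads), axis=0))

pet = np.random.rand(32, 64, 64)   # stand-in volume
edges = sobel_edges(pet)           # same shape, highlights anatomical boundaries
```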
Collapse
|
84
|
Gupta M, Kumar N, Gupta N, Zaguia A. Fusion of multi-modality biomedical images using deep neural networks. Soft comput 2022. [DOI: 10.1007/s00500-022-07047-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
|
85
|
Osman AFI, Tamam NM. Deep learning-based convolutional neural network for intramodality brain MRI synthesis. J Appl Clin Med Phys 2022; 23:e13530. [PMID: 35044073 PMCID: PMC8992958 DOI: 10.1002/acm2.13530] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/10/2021] [Revised: 12/22/2021] [Accepted: 12/25/2021] [Indexed: 12/16/2022] Open
Abstract
PURPOSE The existence of multicontrast magnetic resonance (MR) images increases the level of clinical information available for the diagnosis and treatment of brain cancer patients. However, acquiring the complete set of multicontrast MR images is not always practically feasible. In this study, we developed a state-of-the-art deep learning convolutional neural network (CNN) for image-to-image translation across three standard MRI contrasts for the brain. METHODS The BRATS 2018 MRI dataset of 477 patients clinically diagnosed with glioma brain cancer was used in this study, with each patient having T1-weighted (T1), T2-weighted (T2), and FLAIR contrasts. It was randomly split into 64%, 16%, and 20% as training, validation, and test sets, respectively. We developed a U-Net model to learn the nonlinear mapping of a source image contrast to a target image contrast across three MRI contrasts. The model was trained and validated with 2D paired MR images using a mean-squared error (MSE) cost function, Adam optimizer with 0.001 learning rate, and 120 epochs with a batch size of 32. The synthetic MR images generated by our model were evaluated against the ground-truth images by computing the MSE, mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and structural similarity index (SSIM). RESULTS The synthetic MR images generated by our model were nearly indistinguishable from the real images on the testing dataset for all translations, except that synthetic FLAIR images had slightly lower quality and exhibited loss of details. The ranges of average PSNR, MSE, MAE, and SSIM values over the six translations were 29.44-33.25 dB, 0.0005-0.0012, 0.0086-0.0149, and 0.932-0.946, respectively. Our results were as good as the best-reported results by other deep learning models on BRATS datasets. CONCLUSIONS Our U-Net model demonstrated that it can accurately perform image-to-image translation across brain MRI contrasts. It holds great promise for clinical use, enabling improved clinical decision-making and better diagnosis of brain cancer patients through the availability of multicontrast MRIs. This approach may be clinically relevant and represents a significant step toward efficiently filling the gap of absent MR sequences without additional scanning.
Collapse
Affiliation(s)
- Alexander F I Osman
- Department of Medical Physics, Al-Neelain University, Khartoum, 11121, Sudan
| | - Nissren M Tamam
- Department of Physics, College of Science, Princess Nourah bint Abdulrahman University, P. O. Box 84428, Riyadh, 11671, Saudi Arabia
| |
Collapse
|
86
|
Zhu X, Wu Y, Hu H, Zhuang X, Yao J, Ou D, Li W, Song M, Feng N, Xu D. Medical lesion segmentation by combining multi‐modal images with modality weighted UNet. Med Phys 2022; 49:3692-3704. [PMID: 35312077 DOI: 10.1002/mp.15610] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2021] [Revised: 02/25/2022] [Accepted: 03/04/2022] [Indexed: 11/09/2022] Open
Affiliation(s)
- Xiner Zhu
- College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, China
| | - Yichao Wu
- College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, China
| | - Haoji Hu
- College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, China
| | - Xianwei Zhuang
- College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, China
| | - Jincao Yao
- Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Hangzhou, China
- Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou, China
| | - Di Ou
- Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Hangzhou, China
- Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou, China
| | - Wei Li
- Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Hangzhou, China
- Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou, China
| | - Mei Song
- Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Hangzhou, China
- Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou, China
| | - Na Feng
- Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Hangzhou, China
- Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou, China
| | - Dong Xu
- Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Hangzhou, China
- Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou, China
| |
Collapse
|
87
|
Gao W, Wang W, Song D, Wang K, Lian D, Yang C, Zhu K, Zheng J, Zeng M, Rao S, Wang M. A Multiparametric Fusion Deep Learning Model Based on DCE-MRI for Preoperative Prediction of Microvascular Invasion in Intrahepatic Cholangiocarcinoma. J Magn Reson Imaging 2022; 56:1029-1039. [PMID: 35191550 DOI: 10.1002/jmri.28126] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/27/2021] [Revised: 02/11/2022] [Accepted: 02/11/2022] [Indexed: 12/22/2022] Open
Affiliation(s)
- Wenyu Gao
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
- Shanghai Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention, Shanghai 200032, China
| | - Wentao Wang
- Department of Radiology, Cancer Center, Zhongshan Hospital, Fudan University, China
- Shanghai Institute of Medical Imaging, Shanghai, China
| | - Danjun Song
- Liver Cancer Institute, Zhongshan Hospital, Fudan University, Shanghai, China
- Department of Interventional Radiology, Zhejiang Cancer Hospital, Hangzhou, Zhejiang, China
| | - Kang Wang
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
- Shanghai Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention, Shanghai 200032, China
| | - Danlan Lian
- Department of Radiology, Xiamen Branch, Zhongshan Hospital, Fudan University, Xiamen, China
| | - Chun Yang
- Department of Radiology, Cancer Center, Zhongshan Hospital, Fudan University, China
| | - Kai Zhu
- Liver Cancer Institute, Zhongshan Hospital, Fudan University, Shanghai, China
| | - Jiaping Zheng
- Department of Interventional Radiology, Zhejiang Cancer Hospital, Hangzhou, Zhejiang, China
| | - Mengsu Zeng
- Department of Radiology, Cancer Center, Zhongshan Hospital, Fudan University, China
- Shanghai Institute of Medical Imaging, Shanghai, China
| | - Sheng-xiang Rao
- Department of Radiology, Cancer Center, Zhongshan Hospital, Fudan University, China
- Shanghai Institute of Medical Imaging, Shanghai, China
| | - Manning Wang
- Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
- Shanghai Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention, Shanghai 200032, China
| |
Collapse
|
88
|
Gourdeau D, Duchesne S, Archambault L. On the proper use of structural similarity for the robust evaluation of medical image synthesis models. Med Phys 2022; 49:2462-2474. [PMID: 35106778 DOI: 10.1002/mp.15514] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2021] [Revised: 01/18/2022] [Accepted: 01/19/2022] [Indexed: 11/07/2022] Open
Abstract
PURPOSE To propose good practices for using the structural similarity metric (SSIM) and reporting its value. SSIM is one of the most popular image quality metrics in use in the medical image synthesis community because of its alleged superiority over voxel-by-voxel measurements like the average error or the peak signal-to-noise ratio (PSNR). It has seen massive adoption since its introduction, but its limitations are often overlooked. Notably, SSIM is designed to work on a strictly positive intensity scale, which is generally not the case in medical imaging. Common intensity scales such as Hounsfield units (HU) contain negative numbers, and negative values can also be introduced by image normalization techniques such as z-normalization. METHODS We created a series of experiments to quantify the impact of negative values in the SSIM computation. Specifically, we trained a 3D U-Net to synthesize T2-weighted MRI from T1-weighted MRI using the BRATS 2018 dataset. SSIM was computed on the synthetic images with a shifted dynamic range. Next, to evaluate the suitability of SSIM as a loss function on images with negative values, it was used as a loss function to synthesize z-normalized images. Finally, the difference between 2D SSIM and 3D SSIM was investigated using multiple 2D U-Nets trained on different planes of the images. RESULTS The impact of the misuse of the SSIM was quantified; it was established that it introduces a large downward bias in the computed SSIM. It also introduces a small random error that can change the relative ranking of models. The exact values of this bias and error depend on the quality and the intensity histogram of the synthetic images. Although small, the reported error is significant considering the small SSIM difference between state-of-the-art models. It was therefore shown that SSIM cannot be used as a loss function when images contain negative values, owing to major errors in the gradient calculation that result in under-performing models. 2D SSIM was also found to be overestimated in 2D image synthesis models when computed along the plane of synthesis, due to the discontinuities between slices that are typical of 2D synthesis methods. CONCLUSION Various types of misuse of the SSIM were identified and their impact was quantified. Based on the findings, this paper proposes good practices when using SSIM, such as reporting the average over the volume of the image containing tissue and appropriately defining the dynamic range.
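The core pitfall, that SSIM is not shift-invariant and behaves poorly once intensities become negative even when the stated dynamic range is correct, can be reproduced in a few lines with scikit-image (the images and noise level below are synthetic and the exact values will vary):

```python
import numpy as np
from skimage.metrics import structural_similarity

rng = np.random.default_rng(0)
ref = rng.random((128, 128)).astype(np.float32)                 # intensities in [0, 1]
deg = np.clip(ref + 0.05 * rng.standard_normal(ref.shape), 0, 1).astype(np.float32)

dr = 1.0                                                         # true dynamic range
print("positive scale:", structural_similarity(ref, deg, data_range=dr))

# Shifting both images into a signed range (as z-normalization or HU values would)
# leaves the content unchanged, yet the luminance term now mixes signs and the
# reported SSIM is biased downward.
print("shifted scale:", structural_similarity(ref - 0.5, deg - 0.5, data_range=dr))
```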
Collapse
Affiliation(s)
- Daniel Gourdeau
- Université Laval, Department of physics, engineering physics and optics, Québec, QC, G1R 2J6, Canada; CHUQ Cancer Research Center, Québec, QC, Canada; CERVO Brain Research Center, Québec, QC, Canada
| | - Simon Duchesne
- CERVO Brain Research Center, Québec, QC, Canada; Université Laval, Department of radiology, Québec, QC, G1V 0A6, Canada
| | - Louis Archambault
- Université Laval, Department of physics, engineering physics and optics, Québec, QC, G1R 2J6, Canada; CHUQ Cancer Research Center, Québec, QC, Canada
| |
Collapse
|
89
|
Fu H, Zhou T, Li S, Frangi AF. Guest Editorial Generative Adversarial Networks in Biomedical Image Computing. IEEE J Biomed Health Inform 2022. [DOI: 10.1109/jbhi.2021.3134004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
|
90
|
Luo Y, Zhou L, Zhan B, Fei Y, Zhou J, Wang Y, Shen D. Adaptive rectification based adversarial network with spectrum constraint for high-quality PET image synthesis. Med Image Anal 2021; 77:102335. [PMID: 34979432 DOI: 10.1016/j.media.2021.102335] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2021] [Revised: 11/02/2021] [Accepted: 12/13/2021] [Indexed: 12/13/2022]
Abstract
Positron emission tomography (PET) is a typical nuclear imaging technique, which can provide crucial functional information for early brain disease diagnosis. Generally, clinically acceptable PET images are obtained by injecting a standard-dose radioactive tracer into the human body; on the other hand, the cumulative radiation exposure inevitably raises concerns about potential health risks. However, reducing the tracer dose will increase the noise and artifacts of the reconstructed PET image. For the purpose of acquiring high-quality PET images while reducing radiation exposure, in this paper, we innovatively present an adaptive rectification-based generative adversarial network with spectrum constraint, named AR-GAN, which uses low-dose PET (LPET) images to synthesize high-quality standard-dose PET (SPET) images. Specifically, considering the differences between the SPET image synthesized by a traditional GAN and the real SPET image, an adaptive rectification network (AR-Net) is devised to estimate the residual between the preliminarily predicted image and the real SPET image, based on the hypothesis that a more realistic rectified image can be obtained by incorporating both the residual and the preliminarily predicted PET image. Moreover, to address the issue of high-frequency distortions in the output image, we employ a spectral regularization term in the training optimization objective to constrain the consistency of the synthesized image and the real image in the frequency domain, which further preserves the high-frequency detailed information and improves synthesis performance. Validations on both the phantom dataset and the clinical dataset show that the proposed AR-GAN can estimate SPET images from LPET images effectively and outperform other state-of-the-art image synthesis approaches.
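The spectral regularization term, constraining consistency between synthesized and real images in the frequency domain, can be sketched as an L1 penalty between Fourier magnitude spectra (PyTorch; the exact formulation used in AR-GAN may differ):

```python
import torch

def spectral_consistency_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """L1 distance between the Fourier magnitude spectra of synthetic and real images.
    pred, target: (B, 1, H, W) real-valued tensors."""
    pred_spec = torch.fft.fft2(pred).abs()
    target_spec = torch.fft.fft2(target).abs()
    return (pred_spec - target_spec).abs().mean()
```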
Collapse
Affiliation(s)
- Yanmei Luo
- School of Computer Science, Sichuan University, China
| | - Luping Zhou
- School of Electrical and Information Engineering, University of Sydney, Australia
| | - Bo Zhan
- School of Computer Science, Sichuan University, China
| | - Yuchen Fei
- School of Computer Science, Sichuan University, China
| | - Jiliu Zhou
- School of Computer Science, Sichuan University, China; School of Computer Science, Chengdu University of Information Technology, China
| | - Yan Wang
- School of Computer Science, Sichuan University, China.
| | - Dinggang Shen
- School of Biomedical Engineering, ShanghaiTech University, China; Department of Research and Development, Shanghai United Imaging Intelligence Co., Ltd., Shanghai, China
| |
Collapse
|
91
|
Qin D, Bu JJ, Liu Z, Shen X, Zhou S, Gu JJ, Wang ZH, Wu L, Dai HF. Efficient Medical Image Segmentation Based on Knowledge Distillation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2021; 40:3820-3831. [PMID: 34283713 DOI: 10.1109/tmi.2021.3098703] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/20/2023]
Abstract
Recent advances have been made in applying convolutional neural networks to achieve more precise prediction results for medical image segmentation problems. However, the success of existing methods has relied heavily on huge computational complexity and massive storage, which is impractical in real-world scenarios. To deal with this problem, we propose an efficient architecture by distilling knowledge from well-trained medical image segmentation networks to train another lightweight network. This architecture empowers the lightweight network to achieve a significant improvement in segmentation capability while retaining its runtime efficiency. We further devise a novel distillation module tailored for medical image segmentation to transfer semantic region information from the teacher to the student network. It forces the student network to mimic the extent of difference between representations calculated from different tissue regions. This module avoids the ambiguous boundary problem encountered when dealing with medical imaging and instead encodes the internal information of each semantic region for transferring. Benefiting from our module, the lightweight network achieves an improvement of up to 32.6% in our experiments while maintaining its portability in the inference phase. The entire structure has been verified on two widely accepted public CT datasets, LiTS17 and KiTS19. We demonstrate that a lightweight network distilled by our method has non-negligible value in scenarios that require relatively high operating speed and low storage usage.
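For orientation, a generic pixel-wise distillation objective for segmentation, a temperature-softened KL term against the teacher plus the usual supervised cross-entropy, looks as follows (PyTorch; this is the standard formulation rather than the paper's region-wise distillation module, and the temperature and weighting values are illustrative):

```python
import torch.nn.functional as F

def segmentation_kd_loss(student_logits, teacher_logits, labels, T: float = 2.0, alpha: float = 0.5):
    """Knowledge-distillation loss: softened teacher imitation plus hard-label cross-entropy."""
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce
```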
Collapse
|
92
|
Li W, Xiao H, Li T, Ren G, Lam S, Teng X, Liu C, Zhang J, Kar-Ho Lee F, Au KH, Ho-Fun Lee V, Chang ATY, Cai J. Virtual Contrast-enhanced Magnetic Resonance Images Synthesis for Patients With Nasopharyngeal Carcinoma Using Multimodality-guided Synergistic Neural Network. Int J Radiat Oncol Biol Phys 2021; 112:1033-1044. [PMID: 34774997 DOI: 10.1016/j.ijrobp.2021.11.007] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2021] [Revised: 09/30/2021] [Accepted: 11/04/2021] [Indexed: 12/16/2022]
Abstract
PURPOSE To investigate a novel deep-learning network that synthesizes virtual contrast-enhanced T1-weighted (vceT1w) magnetic resonance images (MRI) from multimodality contrast-free MRI for patients with nasopharyngeal carcinoma (NPC). METHODS AND MATERIALS This article presents a retrospective analysis of multiparametric MRI, with and without contrast enhancement by gadolinium-based contrast agents (GBCAs), obtained from 64 biopsy-proven cases of NPC treated at Hong Kong Queen Elizabeth Hospital. A multimodality-guided synergistic neural network (MMgSN-Net) was developed to leverage complementary information between contrast-free T1-weighted and T2-weighted MRI for vceT1w MRI synthesis. Thirty-five patients were randomly selected for model training, whereas 29 patients were selected for model testing. The synthetic images generated from MMgSN-Net were quantitatively evaluated against real GBCA-enhanced T1-weighted MRI using a series of statistical evaluating metrics, which include mean absolute error (MAE), mean squared error (MSE), structural similarity index (SSIM), and peak signal-to-noise ratio (PSNR). Qualitative visual assessment between the real and synthetic MRI was also performed. Effectiveness of our MMgSN-Net was compared with 3 state-of-the-art deep-learning networks, including U-Net, CycleGAN, and Hi-Net, both quantitatively and qualitatively. Furthermore, a Turing test was performed by 7 board-certified radiation oncologists from 4 hospitals for assessing authenticity of the synthesized vceT1w MRI against the real GBCA-enhanced T1-weighted MRI. RESULTS Results from the quantitative evaluations demonstrated that our MMgSN-Net outperformed U-Net, CycleGAN and Hi-Net, yielding the top-ranked scores in averaged MAE (44.50 ± 13.01), MSE (9193.22 ± 5405.00), SSIM (0.887 ± 0.042), and PSNR (33.17 ± 2.14). Furthermore, the mean accuracy of the 7 readers in the Turing tests was determined to be 49.43%, equivalent to random guessing (ie, 50%) in distinguishing between real GBCA-enhanced T1-weighted and synthetic vceT1w MRI. Qualitative evaluation indicated that MMgSN-Net gave the best approximation to the ground-truth images, particularly in visualization of tumor-to-muscle interface and the intratumor texture information. CONCLUSIONS Our MMgSN-Net was capable of synthesizing highly realistic vceT1w MRI that outperformed the 3 comparable state-of-the-art networks.
Collapse
Affiliation(s)
- Wen Li
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China
| | - Haonan Xiao
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China
| | - Tian Li
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China
| | - Ge Ren
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China
| | - Saikit Lam
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China
| | - Xinzhi Teng
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China
| | - Chenyang Liu
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China
| | - Jiang Zhang
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China
| | - Francis Kar-Ho Lee
- Department of Clinical Oncology, Queen Elizabeth Hospital, Hong Kong SAR, China
| | - Kwok-Hung Au
- Department of Clinical Oncology, Queen Elizabeth Hospital, Hong Kong SAR, China
| | - Victor Ho-Fun Lee
- Department of Clinical Oncology, The University of Hong Kong, Hong Kong SAR, China
| | - Amy Tien Yee Chang
- Comprehensive Oncology Centre, Hong Kong Sanatorium & Hospital, Hong Kong SAR, China
| | - Jing Cai
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China.
| |
Collapse
|
93
|
Tomar D, Lortkipanidze M, Vray G, Bozorgtabar B, Thiran JP. Self-Attentive Spatial Adaptive Normalization for Cross-Modality Domain Adaptation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2021; 40:2926-2938. [PMID: 33577450 DOI: 10.1109/tmi.2021.3059265] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Despite the successes of deep neural networks on many challenging vision tasks, they often fail to generalize to new test domains that are not distributed identically to the training data. Domain adaptation becomes more challenging for cross-modality medical data with a notable domain shift, given that specific annotated imaging modalities may not be accessible or complete. Our proposed solution is based on the cross-modality synthesis of medical images to reduce the costly annotation burden on radiologists and bridge the domain gap in radiological images. We present a novel approach for image-to-image translation in medical images, capable of operating in supervised or unsupervised (unpaired image data) setups. Built upon adversarial training, we propose a learnable self-attentive spatial normalization of the deep convolutional generator network's intermediate activations. Unlike previous attention-based image-to-image translation approaches, which are either domain-specific or require distortion of the source domain's structures, we unearth the importance of auxiliary semantic information for handling geometric changes and preserving anatomical structures during image translation. We achieve superior results for cross-modality segmentation between unpaired MRI and CT data on multi-modality whole-heart and multi-modal brain tumor MRI (T1/T2) datasets compared with state-of-the-art methods. We also observe encouraging results in cross-modality conversion for paired MRI and CT images on a brain dataset. Furthermore, a detailed analysis of the cross-modality image translation and thorough ablation studies confirm our proposed method's efficacy.
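The spatially adaptive normalization at the core of this approach can be illustrated with a minimal SPADE-style layer such as the one below; this is a hedged reconstruction from the abstract (the self-attention branch is omitted) and not the authors' implementation, with all class and argument names chosen for illustration.

```python
# Illustrative PyTorch sketch of a spatially adaptive normalization layer:
# activations are normalized, then modulated per pixel by scale/shift maps
# predicted from an auxiliary semantic map.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialAdaptiveNorm(nn.Module):
    def __init__(self, num_features: int, label_channels: int, hidden: int = 64):
        super().__init__()
        self.norm = nn.InstanceNorm2d(num_features, affine=False)
        self.shared = nn.Sequential(nn.Conv2d(label_channels, hidden, 3, padding=1), nn.ReLU())
        self.gamma = nn.Conv2d(hidden, num_features, 3, padding=1)
        self.beta = nn.Conv2d(hidden, num_features, 3, padding=1)

    def forward(self, x, semantic_map):
        # Resize the semantic map to the feature resolution, then modulate.
        semantic_map = F.interpolate(semantic_map, size=x.shape[-2:], mode="nearest")
        h = self.shared(semantic_map)
        return self.norm(x) * (1 + self.gamma(h)) + self.beta(h)
```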
Collapse
|
94
|
Chao Z, Xu W. A New General Maximum Intensity Projection Technology via the Hybrid of U-Net and Radial Basis Function Neural Network. J Digit Imaging 2021; 34:1264-1278. [PMID: 34508300 PMCID: PMC8432629 DOI: 10.1007/s10278-021-00504-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2021] [Revised: 07/16/2021] [Accepted: 08/05/2021] [Indexed: 11/29/2022] Open
Abstract
Maximum intensity projection (MIP) technology is a computer visualization method that projects three-dimensional spatial data onto a visualization plane. According to the specific purpose, a specific slab thickness and projection direction can be selected. This technology can better show structures such as blood vessels, arteries, veins, and bronchi from different directions, which brings more intuitive and comprehensive results for doctors in the diagnosis of related diseases. However, with this traditional projection technology, the details of a small projected target are not clearly visualized when the target differs little from the surrounding environment, which could lead to missed diagnosis or misdiagnosis. Therefore, it is urgent to develop a new technology that can display the angiogram more clearly. However, to the best of our knowledge, research in this area is scarce. To fill this gap in the literature, in the present study we propose a new method based on a hybrid of a convolutional neural network (CNN) and a radial basis function neural network (RBFNN) to synthesize the projection image. We first adopted the U-net to obtain feature or enhanced images to be projected; subsequently, the RBF neural network performed further synthesis processing on these data; finally, the projection images were obtained. For the experimental data, in order to increase the robustness of the proposed algorithm, three different types of datasets were adopted: vascular projection of the brain, bronchial projection of the lung parenchyma, and vascular projection of the liver. In addition, radiologist evaluation and five classic metrics of image definition were used for analysis. Finally, compared with traditional MIP technology and other network structures, superior experimental results on a large number of different types of data demonstrated the versatility and robustness of the proposed method.
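For context, conventional slab MIP, the baseline the hybrid CNN/RBFNN method seeks to improve, reduces to taking the maximum along one axis of the volume; the sketch below assumes a NumPy volume and an illustrative axis convention.

```python
# Minimal sketch of conventional slab maximum intensity projection (MIP).
import numpy as np

def slab_mip(volume: np.ndarray, start: int, thickness: int, axis: int = 0) -> np.ndarray:
    """Project the brightest voxel along `axis` within a slab of the volume."""
    stop = min(start + thickness, volume.shape[axis])
    slab = np.take(volume, indices=range(start, stop), axis=axis)
    return slab.max(axis=axis)

# Example: a 20-slice slab projected along the first (e.g., axial) axis.
vol = np.random.rand(120, 256, 256).astype(np.float32)
mip_image = slab_mip(vol, start=40, thickness=20, axis=0)
```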
Collapse
Affiliation(s)
- Zhen Chao
- College of Artificial Intelligence and Big Data for Medical Sciences, Shandong First Medical University & Shandong Academy of Medical Sciences, Huaiyin District, 6699 Qingdao Road, Jinan, 250117, Shandong, China.
- Research Lab for Medical Imaging and Digital Surgery, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China.
- Department of Radiation Convergence Engineering, College of Health Science, Yonsei University, 1 Yonseidae-gil, Wonju, Gangwon, 26493, South Korea.
| | - Wenting Xu
- Department of Radiation Convergence Engineering, College of Health Science, Yonsei University, 1 Yonseidae-gil, Wonju, Gangwon, 26493, South Korea
| |
Collapse
|
95
|
Luo Y, Nie D, Zhan B, Li Z, Wu X, Zhou J, Wang Y, Shen D. Edge-preserving MRI image synthesis via adversarial network with iterative multi-scale fusion. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.04.060] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022]
|
96
|
He S, Pereira D, David Perez J, Gollub RL, Murphy SN, Prabhu S, Pienaar R, Robertson RL, Ellen Grant P, Ou Y. Multi-channel attention-fusion neural network for brain age estimation: Accuracy, generality, and interpretation with 16,705 healthy MRIs across lifespan. Med Image Anal 2021; 72:102091. [PMID: 34038818 PMCID: PMC8316301 DOI: 10.1016/j.media.2021.102091] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/19/2020] [Revised: 03/10/2021] [Accepted: 04/14/2021] [Indexed: 12/31/2022]
Abstract
Brain age estimated by machine learning from T1-weighted magnetic resonance images (T1w MRIs) can reveal how brain disorders alter brain aging and can help in the early detection of such disorders. A fundamental step is to build an accurate age estimator from healthy brain MRIs. We focus on this step, and propose a framework to improve the accuracy, generality, and interpretation of age estimation in healthy brain MRIs. For accuracy, we used one of the largest sample sizes (N = 16,705). For each subject, our proposed algorithm first explicitly splits the T1w image, which has been commonly treated as a single-channel 3D image in other studies, into two 3D image channels representing contrast and morphometry information. We further proposed a "fusion-with-attention" deep learning convolutional neural network (FiA-Net) to learn how to best fuse the contrast and morphometry image channels. FiA-Net recognizes varying contributions across image channels at different brain anatomy and different feature layers. In contrast, multi-channel fusion has not previously been used for brain age estimation, and is mostly attention-free in other medical image analysis tasks (e.g., image synthesis or segmentation), where treating channels equally may not be optimal. For generality, we used lifespan data spanning 0-97 years of age for real-world utility, and we thoroughly tested FiA-Net for multi-site and multi-scanner generality with two phases of cross-validation in discovery and replication data, whereas most other studies use only one phase of cross-validation. For interpretation, we directly measured each artificial neuron's correlation with chronological age, whereas other studies look at feature saliency, and salient features may or may not predict age. Overall, FiA-Net achieved a mean absolute error (MAE) of 3.00 years and Pearson correlation r=0.9840 with known chronological ages in healthy brain MRIs 0-97 years of age, comparing favorably with state-of-the-art algorithms and studies for accuracy and generality across sites and datasets. We also provided interpretations of how different artificial neurons and real neuroanatomy contribute to the age estimation.
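The fusion-with-attention idea, learning per-location weights that decide how much the contrast and morphometry channels each contribute, can be sketched as below; this is a strongly simplified, assumption-based illustration, not the published FiA-Net architecture.

```python
# Illustrative PyTorch sketch: attention-weighted fusion of two 3D feature
# streams (e.g., a contrast branch and a morphometry branch).
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Predict a per-voxel weight for each branch, normalized to sum to 1.
        self.attn = nn.Sequential(nn.Conv3d(2 * channels, 2, kernel_size=1), nn.Softmax(dim=1))

    def forward(self, contrast_feat, morpho_feat):
        w = self.attn(torch.cat([contrast_feat, morpho_feat], dim=1))
        return w[:, :1] * contrast_feat + w[:, 1:] * morpho_feat
```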
Collapse
Affiliation(s)
- Sheng He
- Boston Children's Hospital and Harvard Medical School, 300 Longwood Ave., Boston, MA, USA
| | - Diana Pereira
- Boston Children's Hospital and Harvard Medical School, 300 Longwood Ave., Boston, MA, USA
| | - Juan David Perez
- Boston Children's Hospital and Harvard Medical School, 300 Longwood Ave., Boston, MA, USA
| | - Randy L Gollub
- Massachusetts General Hospital and Harvard Medical School, 55 Fruit St., Boston, MA, USA
| | - Shawn N Murphy
- Massachusetts General Hospital and Harvard Medical School, 55 Fruit St., Boston, MA, USA
| | - Sanjay Prabhu
- Boston Children's Hospital and Harvard Medical School, 300 Longwood Ave., Boston, MA, USA
| | - Rudolph Pienaar
- Boston Children's Hospital and Harvard Medical School, 300 Longwood Ave., Boston, MA, USA
| | - Richard L Robertson
- Boston Children's Hospital and Harvard Medical School, 300 Longwood Ave., Boston, MA, USA
| | - P Ellen Grant
- Boston Children's Hospital and Harvard Medical School, 300 Longwood Ave., Boston, MA, USA
| | - Yangming Ou
- Boston Children's Hospital and Harvard Medical School, 300 Longwood Ave., Boston, MA, USA.
| |
Collapse
|
97
|
Zhan B, Li D, Wu X, Zhou J, Wang Y. Multi-modal MRI Image Synthesis via GAN with Multi-scale Gate Mergence. IEEE J Biomed Health Inform 2021; 26:17-26. [PMID: 34125692 DOI: 10.1109/jbhi.2021.3088866] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Multi-modal magnetic resonance imaging (MRI) plays a critical role in clinical diagnosis and treatment nowadays. Each modality of MRI presents its own specific anatomical features, which serve as complementary information to other modalities and can provide rich diagnostic information. However, owing to time-consuming acquisition and high cost, some image sequences of patients may be lost or corrupted, posing an obstacle to accurate diagnosis. Although current multi-modal image synthesis approaches are able to alleviate these issues to some extent, they still fall far short of fusing modalities effectively. In light of this, we propose a multi-scale gate mergence based generative adversarial network model, namely MGM-GAN, to synthesize one modality of MRI from others. Notably, we have multiple down-sampling branches corresponding to the input modalities to specifically extract their unique features. In contrast to the generic multi-modal fusion approach of averaging or maximizing operations, we introduce a gate mergence (GM) mechanism to automatically learn the weights of different modalities across locations, enhancing task-related information while suppressing irrelevant information. As such, the feature maps of all the input modalities at each down-sampling level, i.e., at multiple scales, are integrated via the GM module. In addition, both the adversarial loss and the pixel-wise loss, as well as a gradient difference loss (GDL), are applied to train the network to produce the desired modality accurately. Extensive experiments demonstrate that the proposed method outperforms the state-of-the-art multi-modal image synthesis methods.
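A gate-mergence style fusion step, in which learned per-location gates weight each modality branch before merging, might look like the following sketch; it is an illustrative reconstruction from the abstract, not the authors' MGM-GAN code.

```python
# Illustrative PyTorch sketch of gate-based fusion of per-modality feature maps.
import torch
import torch.nn as nn

class GateMergence(nn.Module):
    def __init__(self, channels: int, num_modalities: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(num_modalities * channels, num_modalities, kernel_size=1),
            nn.Softmax(dim=1),  # weights over modalities at every spatial location
        )

    def forward(self, feats):
        # feats: list of (B, C, H, W) tensors, one per input modality.
        gates = self.gate(torch.cat(feats, dim=1))
        fused = sum(gates[:, i:i + 1] * f for i, f in enumerate(feats))
        return fused
```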
Collapse
|
98
|
Chen RJ, Lu MY, Chen TY, Williamson DFK, Mahmood F. Synthetic data in machine learning for medicine and healthcare. Nat Biomed Eng 2021; 5:493-497. [PMID: 34131324 PMCID: PMC9353344 DOI: 10.1038/s41551-021-00751-8] [Citation(s) in RCA: 211] [Impact Index Per Article: 52.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/30/2023]
Abstract
The proliferation of synthetic data in artificial intelligence for medicine and healthcare raises concerns about the vulnerabilities of the software and the challenges of current policy.
Collapse
Affiliation(s)
- Richard J Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Ming Y Lu
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Tiffany Y Chen
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Drew F K Williamson
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Faisal Mahmood
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA.
| |
Collapse
|
99
|
Lei H, Liu W, Xie H, Zhao B, Yue G, Lei B. Unsupervised Domain Adaptation Based Image Synthesis and Feature Alignment for Joint Optic Disc and Cup Segmentation. IEEE J Biomed Health Inform 2021; 26:90-102. [PMID: 34061755 DOI: 10.1109/jbhi.2021.3085770] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Due to the discrepancy among different devices used for fundus image collection, a well-trained neural network is usually unsuitable for another new dataset. To solve this problem, the unsupervised domain adaptation strategy has attracted a lot of attention. In this paper, we propose an unsupervised domain adaptation method based on image synthesis and feature alignment (ISFA) to segment the optic disc and cup in fundus images. A GAN-based image synthesis (IS) mechanism, along with the boundary information of the optic disc and cup, is utilized to generate target-like query images, which serve as an intermediate latent space between source domain and target domain images to alleviate the domain shift problem. Specifically, we use content and style feature alignment (CSFA) to ensure feature consistency among source domain images, target-like query images, and target domain images. Adversarial learning is used to extract domain-invariant features for output-level feature alignment (OLFA). To enhance the representation ability of domain-invariant boundary structure information, we introduce an edge attention module (EAM) for low-level feature maps. Eventually, we train our proposed method on the training set of the REFUGE challenge dataset and test it on the Drishti-GS and RIM-ONE_r3 datasets. On the Drishti-GS dataset, our method achieves about a 3% improvement in Dice on optic cup segmentation over the next best method. We comprehensively discuss the robustness of our method for small-dataset domain adaptation. The experimental results also demonstrate the effectiveness of our method. Our code is available at https://github.com/thinkobj/ISFA.
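The edge attention module for low-level feature maps could, in spirit, resemble the sketch below, where a learned boundary-attention map re-weights the features; this is a speculative simplification, and the authors' released code at the link above is the authoritative reference.

```python
# Illustrative PyTorch sketch of an edge-attention step for low-level features.
import torch
import torch.nn as nn

class EdgeAttention(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.edge_head = nn.Sequential(nn.Conv2d(channels, 1, kernel_size=3, padding=1), nn.Sigmoid())

    def forward(self, low_level_feat):
        edge_map = self.edge_head(low_level_feat)     # (B, 1, H, W) boundary attention
        attended = low_level_feat * (1 + edge_map)    # emphasize edge regions
        return attended, edge_map                     # edge_map could be supervised with disc/cup boundaries
```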
Collapse
|
100
|
Peng B, Liu B, Bin Y, Shen L, Lei J. Multi-Modality MR Image Synthesis via Confidence-Guided Aggregation and Cross-Modality Refinement. IEEE J Biomed Health Inform 2021; 26:27-35. [PMID: 34018939 DOI: 10.1109/jbhi.2021.3082541] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022]
Abstract
Magnetic resonance imaging (MRI) can provide multi-modality MR images by setting task-specific scan parameters, and has been widely used in disease diagnosis and treatment planning. However, in practical clinical applications, it is often difficult to obtain multi-modality MR images simultaneously due to patient discomfort, scanning costs, and so on. Therefore, how to effectively utilize the existing modality images to synthesize a missing modality image has become a hot research topic. In this paper, we propose a novel confidence-guided aggregation and cross-modality refinement network (CACR-Net) for multi-modality MR image synthesis, which effectively utilizes complementary and correlative information from multiple modalities to synthesize high-quality target-modality images. Specifically, to effectively utilize the complementary modality-specific characteristics, a confidence-guided aggregation module is proposed to adaptively aggregate the multiple target-modality images generated from multiple source-modality images by using the corresponding confidence maps. Based on the aggregated target-modality image, a cross-modality refinement module is presented to further refine the target-modality image by mining correlative information among the multiple source-modality images and the aggregated target-modality image. By training the proposed CACR-Net in an end-to-end manner, high-quality and sharp target-modality MR images are effectively synthesized. Experimental results on a widely used benchmark demonstrate that the proposed method outperforms state-of-the-art methods.
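Confidence-guided aggregation, blending several candidate target-modality images with normalized confidence maps, can be sketched as follows; the tensor shapes and the softmax normalization are illustrative assumptions rather than the authors' exact design.

```python
# Illustrative sketch: blend candidate target-modality images, each synthesized
# from a different source modality, using softmax-normalized confidence maps.
import torch

def confidence_guided_aggregation(candidates, confidences):
    """candidates, confidences: equal-length lists of (B, 1, H, W) tensors."""
    conf = torch.softmax(torch.cat(confidences, dim=1), dim=1)  # normalize across candidates
    cand = torch.cat(candidates, dim=1)
    return (conf * cand).sum(dim=1, keepdim=True)               # aggregated target-modality image
```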
Collapse
|