1. Qiu Y, Jiang K, Yao H, Wang Z, Satoh S. Does Adding a Modality Really Make Positive Impacts in Incomplete Multi-Modal Brain Tumor Segmentation? IEEE Trans Med Imaging 2025;44:2194-2205. [PMID: 40031068] [DOI: 10.1109/tmi.2025.3526818]
Abstract
Previous incomplete multi-modal brain tumor segmentation methods, while effective in integrating diverse modalities, commonly deliver smaller performance gains than expected. The reason is that a newly added modality may cause confused predictions in positions where its patterns and quality are uncertain or inconsistent, so direct fusion yields a negative gain for the final decision in those regions. In this paper, considering the potentially negative impacts within a modality, we propose a multi-modal Positive-Negative impact region Double Calibration pipeline, called PNDC, to mitigate misinformation transfer during modality fusion. Concretely, PNDC involves two elaborate pipelines, Reverse Audit and Forward Checksum. The former identifies the negative-impact regions of each modality. The latter calibrates whether the fusion prediction is reliable in these regions by integrating the positive-impact regions of each modality. Finally, the negative-impact regions of each modality and the mismatched fusion predictions are utilized to enhance the learning of the individual modalities and the fusion process. Notably, PNDC adopts the standard training strategy without specific architectural choices and does not introduce any learnable parameters, so it can be easily plugged into existing network training for incomplete multi-modal brain tumor segmentation. Extensive experiments confirm that PNDC greatly alleviates the performance degradation of current state-of-the-art incomplete multi-modal medical methods that arises from overlooking the positive/negative-impact regions of each modality. The code is released at PNDC.

2. Rai HM, Yoo J, Dashkevych S. Transformative Advances in AI for Precise Cancer Detection: A Comprehensive Review of Non-Invasive Techniques. Arch Comput Methods Eng 2025. [DOI: 10.1007/s11831-024-10219-y]

3. Zhang Y, Peng C, Wang Q, Song D, Li K, Kevin Zhou S. Unified Multi-Modal Image Synthesis for Missing Modality Imputation. IEEE Trans Med Imaging 2025;44:4-18. [PMID: 38976465] [DOI: 10.1109/tmi.2024.3424785]
Abstract
Multi-modal medical images provide complementary soft-tissue characteristics that aid in the screening and diagnosis of diseases. However, limited scanning time, image corruption and various imaging protocols often result in incomplete multi-modal images, thus limiting the usage of multi-modal data for clinical purposes. To address this issue, in this paper, we propose a novel unified multi-modal image synthesis method for missing modality imputation. Our method adopts an overall generative adversarial architecture that aims to synthesize missing modalities from any combination of available ones with a single model. To this end, we specifically design a Commonality- and Discrepancy-Sensitive Encoder for the generator to exploit both modality-invariant and modality-specific information contained in the input modalities. The incorporation of both types of information facilitates the generation of images with consistent anatomy and realistic details of the desired distribution. In addition, we propose a Dynamic Feature Unification Module to integrate information from a varying number of available modalities, which enables the network to be robust to random missing modalities. The module performs both hard integration and soft integration, ensuring the effectiveness of feature combination while avoiding information loss. Verified on two public multi-modal magnetic resonance datasets, the proposed method is effective in handling various synthesis tasks and shows superior performance compared to previous methods.
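
As a rough illustration of the kind of feature unification such a module implies, the sketch below combines however many modality feature maps are available through a hard (element-wise maximum) and a soft (weighted average) integration; the shapes, the gating scheme, and the equal blend of the two paths are assumptions for demonstration, not the paper's actual design.

```python
# Illustrative sketch only: hard + soft integration over however many
# modality feature maps are available. Shapes and the gating scheme are
# assumptions for demonstration, not the paper's implementation.
import torch

def unify_features(feats: list) -> torch.Tensor:
    """feats: list of (B, C, H, W) feature maps from the available modalities."""
    stacked = torch.stack(feats, dim=0)              # (M, B, C, H, W)
    hard = stacked.max(dim=0).values                 # hard integration: element-wise max
    # soft integration: weight each modality by a per-pixel score derived
    # from its own activation strength (a stand-in for a learned gate)
    scores = stacked.mean(dim=2, keepdim=True)       # (M, B, 1, H, W)
    weights = torch.softmax(scores, dim=0)           # normalise across modalities
    soft = (weights * stacked).sum(dim=0)            # (B, C, H, W)
    return 0.5 * (hard + soft)                       # blend both views of the features

# Works for any number of available modalities, e.g. 2 of 4:
fused = unify_features([torch.randn(1, 16, 64, 64) for _ in range(2)])
print(fused.shape)  # torch.Size([1, 16, 64, 64])
```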

4. Zhang W, Zhao L, Gou H, Gong Y, Zhou Y, Feng Q. PRSCS-Net: Progressive 3D/2D rigid Registration network with the guidance of Single-view Cycle Synthesis. Med Image Anal 2024;97:103283. [PMID: 39094463] [DOI: 10.1016/j.media.2024.103283]
Abstract
The 3D/2D registration of 3D pre-operative images (computed tomography, CT) and 2D intra-operative images (X-ray) plays an important role in image-guided spine surgeries. Conventional iterative approaches suffer from time-consuming optimization. Existing learning-based approaches incur high computational costs and perform poorly under large misalignment because of projection-induced losses or ill-posed reconstruction. In this paper, we propose a Progressive 3D/2D rigid Registration network with the guidance of Single-view Cycle Synthesis, named PRSCS-Net. Specifically, we first introduce the differentiable backward/forward projection operator into the single-view cycle synthesis network, which reconstructs corresponding 3D geometry features from two 2D intra-operative view images (one from the input, and the other from the synthesis). In this way, the problem of limited views during reconstruction can be solved. Subsequently, we employ a self-reconstruction path to extract latent representations from pre-operative 3D CT images. The following pose estimation process is performed in the 3D geometry feature space, which bridges the dimensional gap, greatly reduces the computational complexity, and ensures that the features extracted from pre-operative and intra-operative images are as relevant as possible to pose estimation. Furthermore, to enhance our model's ability to handle large misalignment, we develop a progressive registration path, including two sub-registration networks, which estimates the pose parameters by warping volume features in two steps. Finally, our proposed method has been evaluated on a public dataset, CTSpine1k, and an in-house dataset, C-ArmLSpine, for 3D/2D registration. Results demonstrate that PRSCS-Net achieves state-of-the-art registration performance in terms of registration accuracy, robustness, and generalizability compared with existing methods. Thus, PRSCS-Net has potential for clinical spinal disease surgical planning and surgical navigation systems.
Affiliation(s)
- Wencong Zhang: School of Biomedical Engineering, Southern Medical University, Guangzhou, 510515, China; Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou, 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou, 510515, China
- Lei Zhao: School of Biomedical Engineering, Southern Medical University, Guangzhou, 510515, China; Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou, 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou, 510515, China
- Hang Gou: School of Biomedical Engineering, Southern Medical University, Guangzhou, 510515, China; Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou, 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou, 510515, China
- Yanggang Gong: School of Biomedical Engineering, Southern Medical University, Guangzhou, 510515, China; Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou, 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou, 510515, China
- Yujia Zhou: School of Biomedical Engineering, Southern Medical University, Guangzhou, 510515, China; Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou, 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou, 510515, China
- Qianjin Feng: School of Biomedical Engineering, Southern Medical University, Guangzhou, 510515, China; Guangdong Provincial Key Laboratory of Medical Image Processing, Southern Medical University, Guangzhou, 510515, China; Guangdong Province Engineering Laboratory for Medical Imaging and Diagnostic Technology, Southern Medical University, Guangzhou, 510515, China

5. Meng X, Sun K, Xu J, He X, Shen D. Multi-Modal Modality-Masked Diffusion Network for Brain MRI Synthesis With Random Modality Missing. IEEE Trans Med Imaging 2024;43:2587-2598. [PMID: 38393846] [DOI: 10.1109/tmi.2024.3368664]
Abstract
Synthesis of unavailable imaging modalities from available ones can generate modality-specific complementary information and enable multi-modality-based medical image diagnosis or treatment. Existing generative methods for medical image synthesis are usually based on cross-modal translation between acquired and missing modalities. These methods are usually dedicated to a specific missing modality and perform synthesis in one shot, so they can neither deal with a varying number of missing modalities flexibly nor construct the mapping across modalities effectively. To address the above issues, in this paper, we propose a unified Multi-modal Modality-masked Diffusion Network (M2DN), tackling multi-modal synthesis from the perspective of "progressive whole-modality inpainting", instead of "cross-modal translation". Specifically, our M2DN considers the missing modalities as random noise and takes all the modalities as a unity in each reverse diffusion step. The proposed joint synthesis scheme performs synthesis for the missing modalities and self-reconstruction for the available ones, which not only enables synthesis for arbitrary missing scenarios, but also facilitates the construction of a common latent space and enhances the model's representation ability. In addition, we introduce a modality-mask scheme to encode the availability status of each incoming modality explicitly in a binary mask, which is adopted as the condition for the diffusion model to further enhance the synthesis performance of our M2DN for arbitrary missing scenarios. We carry out experiments on two public brain MRI datasets for synthesis and downstream segmentation tasks. Experimental results demonstrate that our M2DN outperforms the state-of-the-art models significantly and shows great generalizability for arbitrary missing modalities.
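
To make the modality-mask idea concrete, the following is a minimal sketch, under assumed modality names and shapes, of how the per-step input for such a modality-masked scheme could be assembled: available modalities are kept for self-reconstruction, missing ones start as random noise, and a binary availability mask is returned as the explicit condition. It is not the authors' implementation.

```python
# Minimal sketch, not the authors' code: assemble the input for one reverse
# diffusion step when some modalities are missing. Missing channels start as
# pure noise, and a binary availability mask is passed as the condition.
import torch

def build_step_input(images, shape=(1, 1, 64, 64)):
    """images: dict mapping modality name -> tensor (available) or None (missing)."""
    channels, mask_bits = [], []
    for name in ("T1", "T1ce", "T2", "FLAIR"):        # assumed fixed modality order
        x = images.get(name)
        if x is None:
            channels.append(torch.randn(shape))        # missing modality: treated as noise
            mask_bits.append(0.0)
        else:
            channels.append(x)                         # available modality: self-reconstructed
            mask_bits.append(1.0)
    volume = torch.cat(channels, dim=1)                # (B, 4, H, W): all modalities as a unity
    mask = torch.tensor(mask_bits).view(1, -1, 1, 1).expand(-1, -1, *shape[2:])
    return volume, mask                                # mask is the explicit availability condition

vol, mask = build_step_input({"T1": torch.randn(1, 1, 64, 64), "T1ce": None,
                              "T2": torch.randn(1, 1, 64, 64), "FLAIR": None})
print(vol.shape, mask.shape)  # torch.Size([1, 4, 64, 64]) torch.Size([1, 4, 64, 64])
```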

6. Wang P, Zhang H, Zhu M, Jiang X, Qin J, Yuan Y. MGIML: Cancer Grading With Incomplete Radiology-Pathology Data via Memory Learning and Gradient Homogenization. IEEE Trans Med Imaging 2024;43:2113-2124. [PMID: 38231819] [DOI: 10.1109/tmi.2024.3355142]
Abstract
Taking advantage of multi-modal radiology-pathology data with complementary clinical information for cancer grading helps doctors improve diagnostic efficiency and accuracy. However, radiology and pathology data have distinct acquisition difficulties and costs, which leads to incomplete-modality data being common in applications. In this work, we propose a Memory- and Gradient-guided Incomplete Multi-modal Learning (MGIML) framework for cancer grading with incomplete radiology-pathology data. Firstly, to remedy missing-modality information, we propose a Memory-driven Hetero-modality Complement (MH-Complete) scheme, which constructs modality-specific memory banks constrained by a coarse-grained memory boosting (CMB) loss to record generic radiology and pathology feature patterns, and develops a cross-modal memory reading strategy enhanced by a fine-grained memory consistency (FMC) loss to retrieve missing-modality information from the stored memories. Secondly, since gradient conflicts exist across missing-modality situations, we propose a Rotation-driven Gradient Homogenization (RG-Homogenize) scheme, which estimates instance-specific rotation matrices to smoothly change the feature-level gradient directions, and computes confidence-guided homogenization weights to dynamically balance gradient magnitudes. By simultaneously mitigating gradient direction and magnitude conflicts, this scheme avoids the negative transfer and optimization imbalance problems. Extensive experiments on the CPTAC-UCEC and CPTAC-PDA datasets show that the proposed MGIML framework performs favorably against state-of-the-art multi-modal methods in missing-modality situations.
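
A cross-modal memory read of the kind described above can be illustrated with a simple attention look-up: features of the available modality query a bank of stored patterns for the missing modality and take the attention-weighted read-out as a surrogate feature. The sketch below is only a schematic stand-in; the dimensions and scaling are assumptions, and MGIML's actual CMB/FMC losses are not modeled.

```python
# Illustrative sketch of a cross-modal memory read (not the MGIML code):
# available-modality features query a memory bank that stores generic feature
# patterns of the missing modality; the attention-weighted read-out stands in
# for the missing-modality feature.
import torch

def memory_read(query: torch.Tensor, memory: torch.Tensor) -> torch.Tensor:
    """query: (N, D) available-modality features; memory: (K, D) stored patterns."""
    attn = torch.softmax(query @ memory.T / memory.shape[1] ** 0.5, dim=-1)  # (N, K)
    return attn @ memory                                                      # (N, D) surrogate feature

radiology_feat = torch.randn(8, 256)        # features from the available modality
pathology_bank = torch.randn(128, 256)      # memory bank of the missing modality
imputed_pathology_feat = memory_read(radiology_feat, pathology_bank)
print(imputed_pathology_feat.shape)          # torch.Size([8, 256])
```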

7. Liu Z, Yang B, Shen Y, Ni X, Tsaftaris SA, Zhou H. Long-short diffeomorphism memory network for weakly-supervised ultrasound landmark tracking. Med Image Anal 2024;94:103138. [PMID: 38479152] [DOI: 10.1016/j.media.2024.103138]
Abstract
Ultrasound is a promising medical imaging modality benefiting from low cost and real-time acquisition. Accurate tracking of an anatomical landmark has been of high interest for various clinical workflows such as minimally invasive surgery and ultrasound-guided radiation therapy. However, tracking an anatomical landmark accurately in ultrasound video is very challenging, due to landmark deformation, visual ambiguity and partial observation. In this paper, we propose a long-short diffeomorphism memory network (LSDM), a multi-task framework with an auxiliary learnable deformation prior that supports accurate landmark tracking. Specifically, we design a novel diffeomorphic representation, which contains both long and short temporal information stored in separate memory banks for delineating motion margins and reducing cumulative errors. We further propose an expectation maximization memory alignment (EMMA) algorithm to iteratively optimize both the long and short deformation memory, updating the memory queue to mitigate local anatomical ambiguity. The proposed multi-task system can be trained in a weakly-supervised manner, requiring only a few landmark annotations for tracking and no annotations for deformation learning. We conduct extensive experiments on both public and private ultrasound landmark tracking datasets. Experimental results show that LSDM achieves better or competitive landmark tracking performance with strong generalization capability across different scanner types and different ultrasound modalities, compared with other state-of-the-art methods.
Affiliation(s)
- Zhihua Liu: School of Computing and Mathematical Sciences, University of Leicester, Leicester, LE1 7RH, UK
- Bin Yang: Department of Cardiovascular Sciences, University Hospitals of Leicester NHS Trust, Leicester, LE1 9HN, UK; Nantong-Leicester Joint Institute of Kidney Science, Department of Nephrology, Affiliated Hospital of Nantong University, Nantong, 226001, China
- Yan Shen: Department of Emergency Medicine, Affiliated Hospital of Nantong University, Nantong, 226001, China
- Xuejun Ni: Department of Emergency Medicine, Affiliated Hospital of Nantong University, Nantong, 226001, China
- Sotirios A Tsaftaris: School of Engineering, The University of Edinburgh, Edinburgh EH9 3FG, UK; The Alan Turing Institute, London NW1 2DB, UK
- Huiyu Zhou: School of Computing and Mathematical Sciences, University of Leicester, Leicester, LE1 7RH, UK

8. Qiu L, Zhao L, Zhao W, Zhao J. Dual-space disentangled-multimodal network (DDM-net) for glioma diagnosis and prognosis with incomplete pathology and genomic data. Phys Med Biol 2024;69:085028. [PMID: 38595094] [DOI: 10.1088/1361-6560/ad37ec]
Abstract
Objective. Effective fusion of histology slides and molecular profiles from genomic data has shown great potential in the diagnosis and prognosis of gliomas. However, it remains challenging to explicitly utilize the consistent-complementary information among different modalities and create comprehensive representations of patients. Additionally, existing research mainly focuses on complete multi-modality data and usually fails to construct robust models for incomplete samples. Approach. In this paper, we propose a dual-space disentangled-multimodal network (DDM-net) for glioma diagnosis and prognosis. DDM-net disentangles the latent features generated by two separate variational autoencoders (VAEs) into common and specific components through a dual-space disentangled approach, facilitating the construction of comprehensive representations of patients. More importantly, DDM-net imputes the unavailable modality in the latent feature space, making it robust to incomplete samples. Main results. We evaluated our approach on the TCGA-GBMLGG dataset for glioma grading and survival analysis tasks. Experimental results demonstrate that the proposed method achieves superior performance compared to state-of-the-art methods, with a competitive AUC of 0.952 and a C-index of 0.768. Significance. The proposed model may help the clinical understanding of gliomas and can serve as an effective fusion model with multimodal data. Additionally, it is capable of handling incomplete samples, making it less constrained by clinical limitations.
Affiliation(s)
- Lu Qiu: School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, People's Republic of China
- Lu Zhao: School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, People's Republic of China
- Wangyuan Zhao: School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, People's Republic of China
- Jun Zhao: School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, People's Republic of China

9. Dayarathna S, Islam KT, Uribe S, Yang G, Hayat M, Chen Z. Deep learning based synthesis of MRI, CT and PET: Review and analysis. Med Image Anal 2024;92:103046. [PMID: 38052145] [DOI: 10.1016/j.media.2023.103046]
Abstract
Medical image synthesis represents a critical area of research in clinical decision-making, aiming to overcome the challenges associated with acquiring multiple image modalities for an accurate clinical workflow. This approach proves beneficial in estimating an image of a desired modality from a given source modality among the most common medical imaging contrasts, such as Computed Tomography (CT), Magnetic Resonance Imaging (MRI), and Positron Emission Tomography (PET). However, translating between two image modalities presents difficulties due to the complex and non-linear domain mappings. Deep learning-based generative modelling has exhibited superior performance in synthetic image contrast applications compared to conventional image synthesis methods. This survey comprehensively reviews deep learning-based medical imaging translation from 2018 to 2023 on pseudo-CT, synthetic MR, and synthetic PET. We provide an overview of synthetic contrasts in medical imaging and the most frequently employed deep learning networks for medical image synthesis. Additionally, we conduct a detailed analysis of each synthesis method, focusing on their diverse model designs based on input domains and network architectures. We also analyse novel network architectures, ranging from conventional CNNs to the recent Transformer and Diffusion models. This analysis includes comparing loss functions, available datasets and anatomical regions, and image quality assessments and performance in other downstream tasks. Finally, we discuss the challenges and identify solutions within the literature, suggesting possible future directions. We hope that the insights offered in this survey paper will serve as a valuable roadmap for researchers in the field of medical image synthesis.
Affiliation(s)
- Sanuwani Dayarathna: Department of Data Science and AI, Faculty of Information Technology, Monash University, Clayton VIC 3800, Australia
- Sergio Uribe: Department of Medical Imaging and Radiation Sciences, Faculty of Medicine, Monash University, Clayton VIC 3800, Australia
- Guang Yang: Bioengineering Department and Imperial-X, Imperial College London, W12 7SL, United Kingdom
- Munawar Hayat: Department of Data Science and AI, Faculty of Information Technology, Monash University, Clayton VIC 3800, Australia
- Zhaolin Chen: Department of Data Science and AI, Faculty of Information Technology, Monash University, Clayton VIC 3800, Australia; Monash Biomedical Imaging, Clayton VIC 3800, Australia

10. Raad R, Ray D, Varghese B, Hwang D, Gill I, Duddalwar V, Oberai AA. Conditional generative learning for medical image imputation. Sci Rep 2024;14:171. [PMID: 38167932] [PMCID: PMC10762085] [DOI: 10.1038/s41598-023-50566-7]
Abstract
Image imputation refers to the task of generating one type of medical image given images of another type. This task becomes challenging when the difference between the available images and the image to be imputed is large. In this manuscript, one such application is considered. It is derived from the dynamic contrast-enhanced computed tomography (CECT) imaging of the kidneys: given an incomplete sequence of three CECT images, we are required to impute the missing image. This task is posed as one of probabilistic inference, and a generative algorithm to generate samples of the imputed image, conditioned on the available images, is developed, trained, and tested. The output of this algorithm is the "best guess" of the imputed image and a pixel-wise map of the variance in the imputation. It is demonstrated that this best guess is more accurate than those generated by other, deterministic deep-learning-based algorithms, including ones that utilize additional information and more complex loss terms. It is also shown that the pixel-wise variance image, which quantifies the confidence in the reconstruction, can be used to determine whether the result of the imputation meets a specified accuracy threshold and is therefore appropriate for a downstream task.
Affiliation(s)
- Ragheb Raad: Aerospace and Mechanical Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, CA, 90089, USA
- Deep Ray: Department of Mathematics, University of Maryland, College Park, MD, 20742, USA
- Bino Varghese: Radiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, 90033, USA
- Darryl Hwang: Radiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, 90033, USA
- Inderbir Gill: Urology, Keck School of Medicine, University of Southern California, Los Angeles, CA, 90033, USA
- Vinay Duddalwar: Radiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, 90033, USA
- Assad A Oberai: Aerospace and Mechanical Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, CA, 90089, USA
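
The pixel-wise "best guess" and variance described in the abstract above amount to a mean and variance over repeated conditional samples. The sketch below shows that post-processing step with a toy placeholder sampler; the sampler, sample count, and variance threshold are illustrative assumptions rather than the paper's settings.

```python
# Sketch of the post-processing step described above (placeholder sampler):
# draw repeated conditional samples of the missing image, take the pixel-wise
# mean as the "best guess" and the pixel-wise variance as a confidence map,
# then gate the downstream task on a variance threshold.
import numpy as np

def impute_with_uncertainty(sample_fn, available_images, n_samples=32, var_threshold=0.01):
    samples = np.stack([sample_fn(available_images) for _ in range(n_samples)])  # (S, H, W)
    best_guess = samples.mean(axis=0)        # pixel-wise posterior mean
    variance = samples.var(axis=0)           # pixel-wise confidence (lower = more certain)
    reliable = variance.mean() < var_threshold
    return best_guess, variance, reliable

# Toy placeholder for the conditional generator: mean of the available images plus noise.
rng = np.random.default_rng(0)
toy_sampler = lambda cond: cond.mean(axis=0) + 0.05 * rng.standard_normal(cond.shape[1:])
guess, var_map, ok = impute_with_uncertainty(toy_sampler, rng.standard_normal((3, 64, 64)))
print(guess.shape, var_map.shape, ok)
```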

11. Yang H, Sun J, Xu Z. Learning Unified Hyper-Network for Multi-Modal MR Image Synthesis and Tumor Segmentation With Missing Modalities. IEEE Trans Med Imaging 2023;42:3678-3689. [PMID: 37540616] [DOI: 10.1109/tmi.2023.3301934]
Abstract
Accurate segmentation of brain tumors is of critical importance in clinical assessment and treatment planning, which requires multiple MR modalities providing complementary information. However, due to practical limits, one or more modalities may be missing in real scenarios. To tackle this problem, existing methods need to train multiple networks or a unified but fixed network for various possible missing modality cases, which leads to high computational burdens or sub-optimal performance. In this paper, we propose a unified and adaptive multi-modal MR image synthesis method, and further apply it to tumor segmentation with missing modalities. Based on the decomposition of multi-modal MR images into common and modality-specific features, we design a shared hyper-encoder for embedding each available modality into the feature space, a graph-attention-based fusion block to aggregate the features of available modalities to the fused features, and a shared hyper-decoder for image reconstruction. We also propose an adversarial common feature constraint to enforce the fused features to be in a common space. As for missing modality segmentation, we first conduct the feature-level and image-level completion using our synthesis method and then segment the tumors based on the completed MR images together with the extracted common features. Moreover, we design a hypernet-based modulation module to adaptively utilize the real and synthetic modalities. Experimental results suggest that our method can not only synthesize reasonable multi-modal MR images, but also achieve state-of-the-art performance on brain tumor segmentation with missing modalities.

12. Diao Y, Li F, Li Z. Joint learning-based feature reconstruction and enhanced network for incomplete multi-modal brain tumor segmentation. Comput Biol Med 2023;163:107234. [PMID: 37450967] [DOI: 10.1016/j.compbiomed.2023.107234]
Abstract
Multimodal Magnetic Resonance Imaging (MRI) can provide valuable complementary information and substantially enhance the performance of brain tumor segmentation. However, it is common for certain modalities to be absent or missing during clinical diagnosis, which can significantly impair segmentation techniques that rely on complete modalities. Current advanced methods attempt to address this challenge by developing shared feature representations via modal fusion to handle different missing-modality situations. Considering the importance of missing-modality information in multimodal segmentation, this paper utilizes a feature reconstruction method to recover the missing information and proposes a joint learning-based feature reconstruction and enhancement method for incomplete-modality brain tumor segmentation. The method leverages an information learning mechanism to transfer information from the complete modality to a single modality, enabling it to obtain complete brain tumor information even without the support of other modalities. Additionally, the method incorporates a module for reconstructing missing-modality features, which recovers fused features of the absent modality by utilizing the abundant potential information obtained from the available modalities. Furthermore, the feature enhancement mechanism improves the shared feature representation by utilizing the information obtained from the reconstructed missing modalities. These processes enable the method to obtain more comprehensive information about brain tumors under various missing-modality circumstances, thereby enhancing the model's robustness. The performance of the proposed model was evaluated on BraTS datasets and compared with other deep learning algorithms using Dice similarity scores. On the BraTS2018 dataset, the proposed algorithm achieved Dice similarity scores of 86.28%, 77.02%, and 59.64% for whole tumors, tumor cores, and enhancing tumors, respectively. These results demonstrate the superiority of our framework over state-of-the-art methods in missing-modality situations.
Affiliation(s)
- Yueqin Diao: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming 650500, China
- Fan Li: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming 650500, China
- Zhiyuan Li: Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; Yunnan Key Laboratory of Artificial Intelligence, Kunming 650500, China

13. Liu J, Pasumarthi S, Duffy B, Gong E, Datta K, Zaharchuk G. One Model to Synthesize Them All: Multi-Contrast Multi-Scale Transformer for Missing Data Imputation. IEEE Trans Med Imaging 2023;42:2577-2591. [PMID: 37030684] [PMCID: PMC10543020] [DOI: 10.1109/tmi.2023.3261707]
Abstract
Multi-contrast magnetic resonance imaging (MRI) is widely used in clinical practice as each contrast provides complementary information. However, the availability of each imaging contrast may vary amongst patients, which poses challenges to radiologists and automated image analysis algorithms. A general approach for tackling this problem is missing data imputation, which aims to synthesize the missing contrasts from existing ones. While several convolutional neural networks (CNN) based algorithms have been proposed, they suffer from the fundamental limitations of CNN models, such as the requirement for fixed numbers of input and output channels, the inability to capture long-range dependencies, and the lack of interpretability. In this work, we formulate missing data imputation as a sequence-to-sequence learning problem and propose a multi-contrast multi-scale Transformer (MMT), which can take any subset of input contrasts and synthesize those that are missing. MMT consists of a multi-scale Transformer encoder that builds hierarchical representations of inputs combined with a multi-scale Transformer decoder that generates the outputs in a coarse-to-fine fashion. The proposed multi-contrast Swin Transformer blocks can efficiently capture intra- and inter-contrast dependencies for accurate image synthesis. Moreover, MMT is inherently interpretable as it allows us to understand the importance of each input contrast in different regions by analyzing the in-built attention maps of Transformer blocks in the decoder. Extensive experiments on two large-scale multi-contrast MRI datasets demonstrate that MMT outperforms the state-of-the-art methods quantitatively and qualitatively.
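
One way to picture the sequence-to-sequence formulation is a tokenizer that turns whatever input contrasts are present into one patch-token sequence, with each token tagged by a contrast-specific embedding, as in the sketch below. The patch size, embedding dimension, and module name are assumptions for illustration and do not reproduce MMT's actual multi-scale architecture.

```python
# Illustrative sketch (not the MMT implementation): turn any subset of input
# contrasts into one token sequence for a sequence-to-sequence imputation model.
# Each contrast is split into patches and tagged with its own contrast embedding.
import torch
import torch.nn as nn

class ContrastTokenizer(nn.Module):
    def __init__(self, contrasts=("T1", "T2", "FLAIR"), patch=8, dim=64):
        super().__init__()
        self.proj = nn.Conv2d(1, dim, kernel_size=patch, stride=patch)       # patch embedding
        self.contrast_emb = nn.ParameterDict(
            {c: nn.Parameter(torch.zeros(1, 1, dim)) for c in contrasts})

    def forward(self, available: dict):
        tokens = []
        for name, img in available.items():                      # any subset of contrasts
            patches = self.proj(img).flatten(2).transpose(1, 2)  # (B, N, dim)
            tokens.append(patches + self.contrast_emb[name])     # tag with contrast identity
        return torch.cat(tokens, dim=1)                          # one sequence for the encoder

tok = ContrastTokenizer()
seq = tok({"T1": torch.randn(1, 1, 64, 64), "FLAIR": torch.randn(1, 1, 64, 64)})
print(seq.shape)  # torch.Size([1, 128, 64]): 2 contrasts x 64 patches, 64-dim tokens
```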

14. Xu K, Li T, Khan MS, Gao R, Antic SL, Huo Y, Sandler KL, Maldonado F, Landman BA. Body composition assessment with limited field-of-view computed tomography: A semantic image extension perspective. Med Image Anal 2023;88:102852. [PMID: 37276799] [PMCID: PMC10527087] [DOI: 10.1016/j.media.2023.102852]
Abstract
Field-of-view (FOV) tissue truncation beyond the lungs is common in routine lung screening computed tomography (CT). This poses limitations for opportunistic CT-based body composition (BC) assessment as key anatomical structures are missing. Traditionally, extending the FOV of CT is considered as a CT reconstruction problem using limited data. However, this approach relies on the projection domain data which might not be available in application. In this work, we formulate the problem from the semantic image extension perspective which only requires image data as inputs. The proposed two-stage method identifies a new FOV border based on the estimated extent of the complete body and imputes missing tissues in the truncated region. The training samples are simulated using CT slices with complete body in FOV, making the model development self-supervised. We evaluate the validity of the proposed method in automatic BC assessment using lung screening CT with limited FOV. The proposed method effectively restores the missing tissues and reduces BC assessment error introduced by FOV tissue truncation. In the BC assessment for large-scale lung screening CT datasets, this correction improves both the intra-subject consistency and the correlation with anthropometric approximations. The developed method is available at https://github.com/MASILab/S-EFOV.
Affiliation(s)
- Kaiwen Xu: Vanderbilt University, 2301 Vanderbilt Place, Nashville, 37235, United States
- Thomas Li: Vanderbilt University, 2301 Vanderbilt Place, Nashville, 37235, United States
- Mirza S Khan: Vanderbilt University Medical Center, 1211 Medical Center Drive, Nashville, 37232, United States
- Riqiang Gao: Vanderbilt University, 2301 Vanderbilt Place, Nashville, 37235, United States
- Sanja L Antic: Vanderbilt University Medical Center, 1211 Medical Center Drive, Nashville, 37232, United States
- Yuankai Huo: Vanderbilt University, 2301 Vanderbilt Place, Nashville, 37235, United States
- Kim L Sandler: Vanderbilt University Medical Center, 1211 Medical Center Drive, Nashville, 37232, United States
- Fabien Maldonado: Vanderbilt University Medical Center, 1211 Medical Center Drive, Nashville, 37232, United States
- Bennett A Landman: Vanderbilt University, 2301 Vanderbilt Place, Nashville, 37235, United States; Vanderbilt University Medical Center, 1211 Medical Center Drive, Nashville, 37232, United States

15. Liu Z, Wei J, Li R, Zhou J. Learning multi-modal brain tumor segmentation from privileged semi-paired MRI images with curriculum disentanglement learning. Comput Biol Med 2023;159:106927. [PMID: 37105113] [DOI: 10.1016/j.compbiomed.2023.106927]
Abstract
Since the brain is the human body's primary command and control center, brain cancer is one of the most dangerous cancers. Automatic segmentation of brain tumors from multi-modal images is important in diagnosis and treatment. Due to the difficulties in obtaining multi-modal paired images in clinical practice, recent studies segment brain tumors solely relying on unpaired images and discarding the available paired images. Although these models solve the dependence on paired images, they cannot fully exploit the complementary information from different modalities, resulting in low unimodal segmentation accuracy. Hence, this work studies the unimodal segmentation with privileged semi-paired images, i.e., limited paired images are introduced to the training phase. Specifically, we present a novel two-step (intra-modality and inter-modality) curriculum disentanglement learning framework. The modality-specific style codes describe the attenuation of tissue features and image contrast, and modality-invariant content codes contain anatomical and functional information extracted from the input images. Besides, we address the problem of unthorough decoupling by introducing constraints on the style and content spaces. Experiments on the BraTS2020 dataset highlight that our model outperforms the competing models on unimodal segmentation, achieving average dice scores of 82.91%, 72.62%, and 54.80% for WT (the whole tumor), TC (the tumor core), and ET (the enhancing tumor), respectively. Finally, we further evaluate our model's variable multi-modal brain tumor segmentation performance by introducing a fusion block (TFusion). The experimental results reveal that our model achieves the best WT segmentation performance for all 15 possible modality combinations with 87.31% average accuracy. In summary, we propose a curriculum disentanglement learning framework for unimodal segmentation with privileged semi-paired images. Moreover, the benefits of the improved unimodal segmentation extend to variable multi-modal segmentation, demonstrating that improving the unimodal segmentation performance is significant for brain tumor segmentation with missing modalities. Our code is available at https://github.com/scut-cszcl/SpBTS.
Affiliation(s)
- Zecheng Liu: School of Computer Science and Engineering, South China University of Technology, Guangzhou, China
- Jia Wei: School of Computer Science and Engineering, South China University of Technology, Guangzhou, China
- Rui Li: Golisano College of Computing and Information Sciences, Rochester Institute of Technology, Rochester, NY, USA
- Jianlong Zhou: Data Science Institute, University of Technology Sydney, Ultimo, NSW 2007, Australia

16. A Systematic Literature Review on Applications of GAN-Synthesized Images for Brain MRI. Future Internet 2022. [DOI: 10.3390/fi14120351]
Abstract
With the advances in brain imaging, magnetic resonance imaging (MRI) is evolving as a popular radiological tool in clinical diagnosis. Deep learning (DL) methods can detect abnormalities in brain images without an extensive manual feature extraction process. Generative adversarial network (GAN)-synthesized images have many applications in this field besides augmentation, such as image translation, registration, super-resolution, denoising, motion correction, segmentation, reconstruction, and contrast enhancement. The existing literature was reviewed systematically to understand the role of GAN-synthesized dummy images in brain disease diagnosis. Web of Science and Scopus databases were extensively searched to find relevant studies from the last 6 years to write this systematic literature review (SLR). Predefined inclusion and exclusion criteria helped in filtering the search results. Data extraction is based on related research questions (RQ). This SLR identifies various loss functions used in the above applications and software to process brain MRIs. A comparative study of existing evaluation metrics for GAN-synthesized images helps choose the proper metric for an application. GAN-synthesized images will have a crucial role in the clinical sector in the coming years, and this paper gives a baseline for other researchers in the field.

17. Zhang X, He X, Guo J, Ettehadi N, Aw N, Semanek D, Posner J, Laine A, Wang Y. PTNet3D: A 3D High-Resolution Longitudinal Infant Brain MRI Synthesizer Based on Transformers. IEEE Trans Med Imaging 2022;41:2925-2940. [PMID: 35560070] [PMCID: PMC9529847] [DOI: 10.1109/tmi.2022.3174827]
Abstract
An increased interest in longitudinal neurodevelopment during the first few years after birth has emerged in recent years. Noninvasive magnetic resonance imaging (MRI) can provide crucial information about the development of brain structures in the early months of life. Despite the success of MRI collection and analysis in adults, it remains a challenge for researchers to collect high-quality multimodal MRIs from developing infant brains because of their irregular sleep patterns, limited attention, and inability to follow instructions to stay still during scanning. In addition, there are limited analytic approaches available. These challenges often lead to a significant reduction in usable MRI scans and pose a problem for modeling neurodevelopmental trajectories. Researchers have explored solving this problem by synthesizing realistic MRIs to replace corrupted ones. Among synthesis methods, convolutional neural network-based (CNN-based) generative adversarial networks (GANs) have demonstrated promising performance. In this study, we introduced a novel 3D MRI synthesis framework, the pyramid transformer network (PTNet3D), which relies on attention mechanisms through transformer and performer layers. We conducted extensive experiments on the high-resolution Developing Human Connectome Project (dHCP) and longitudinal Baby Connectome Project (BCP) datasets. Compared with CNN-based GANs, PTNet3D consistently shows superior synthesis accuracy and superior generalization on two independent, large-scale infant brain MRI datasets. Notably, we demonstrate that PTNet3D synthesized more realistic scans than CNN-based models when the input is from multi-age subjects. Potential applications of PTNet3D include synthesizing corrupted or missing images. By replacing corrupted scans with synthesized ones, we observed significant improvement in infant whole-brain segmentation.

18. A joint ventricle and WMH segmentation from MRI for evaluation of healthy and pathological changes in the aging brain. PLoS One 2022;17:e0274212. [PMID: 36067136] [PMCID: PMC9447923] [DOI: 10.1371/journal.pone.0274212]
Abstract
Age-related changes in brain structure include atrophy of the brain parenchyma and white matter changes of presumed vascular origin. Enlargement of the ventricles may occur due to atrophy or impaired cerebrospinal fluid (CSF) circulation. The co-occurrence of these changes in neurodegenerative diseases and in aging brains often requires investigators to take both into account when studying the brain, however, automated segmentation of enlarged ventricles and white matter hyperintensities (WMHs) can be a challenging task. Here, we present a hybrid multi-atlas segmentation and convolutional autoencoder approach for joint ventricle parcellation and WMH segmentation from magnetic resonance images (MRIs). Our fully automated approach uses a convolutional autoencoder to generate a standardized image of grey matter, white matter, CSF, and WMHs, which, in conjunction with labels generated by a multi-atlas segmentation approach, is then fed into a convolutional neural network to parcellate the ventricular system. Hence, our approach does not depend on manually delineated training data for new data sets. The segmentation pipeline was validated on both healthy elderly subjects and subjects with normal pressure hydrocephalus using ground truth manual labels and compared with state-of-the-art segmentation methods. We then applied the method to a cohort of 2401 elderly brains to investigate associations of ventricle volume and WMH load with various demographics and clinical biomarkers, using a multiple regression model. Our results indicate that the ventricle volume and WMH load are both highly variable in a cohort of elderly subjects and there is an independent association between the two, which highlights the importance of taking both the possibility of enlarged ventricles and WMHs into account when studying the aging brain.

19. Liu L, Shen L, Johansson A, Balter JM, Cao Y, Chang D, Xing IL. Real time volumetric MRI for 3D motion tracking via geometry-informed deep learning. Med Phys 2022;49:6110-6119. [PMID: 35766221] [PMCID: PMC10323755] [DOI: 10.1002/mp.15822]
Abstract
PURPOSE: To develop a geometry-informed deep learning framework for volumetric MRI with sub-second acquisition time in support of 3D motion tracking, which is highly desirable for improved radiotherapy precision but hindered by the long image acquisition time. METHODS: A 2D-3D deep learning network with an explicitly defined geometry module that embeds geometric priors of the k-space encoding pattern was investigated, where a 2D generation network first augmented the sparsely sampled image dataset by generating new 2D representations of the underlying 3D subject. A geometry module then unfolded the 2D representations to the volumetric space. Finally, a 3D refinement network took the unfolded 3D data and output high-resolution volumetric images. Patient-specific models were trained for seven abdominal patients to reconstruct volumetric MRI from both orthogonal cine slices and sparse radial samples. To evaluate the robustness of the proposed method to longitudinal patient anatomy and position changes, we tested the trained model on separate datasets acquired more than one month later and evaluated 3D target motion tracking accuracy using the model-reconstructed images by deforming a reference MRI with gross tumor volume (GTV) contours to a 5-min time series of both ground truth and model-reconstructed volumetric images with a temporal resolution of 340 ms. RESULTS: Across the seven patients evaluated, the median distances between model-predicted and ground truth GTV centroids in the superior-inferior direction were 0.4 ± 0.3 mm and 0.5 ± 0.4 mm for cine and radial acquisitions, respectively. The 95th-percentile Hausdorff distances between model-predicted and ground truth GTV contours were 4.7 ± 1.1 mm and 3.2 ± 1.5 mm for cine and radial acquisitions, which are of the same scale as the cross-plane image resolution. CONCLUSION: Incorporating geometric priors into the deep learning model enables volumetric imaging with high spatial and temporal resolution, which is particularly valuable for 3D motion tracking and has the potential of greatly improving MRI-guided radiotherapy precision.
Affiliation(s)
- Lianli Liu: Department of Radiation Oncology, Stanford University, Palo Alto, California, USA
- Liyue Shen: Department of Electrical Engineering, Stanford University, Palo Alto, California, USA
- Adam Johansson: Department of Radiation Oncology, University of Michigan, Ann Arbor, Michigan, USA; Department of Immunology Genetics and Pathology, Uppsala University, Uppsala, Sweden; Department of Surgical Sciences, Uppsala University, Uppsala, Sweden
- James M. Balter: Department of Radiation Oncology, University of Michigan, Ann Arbor, Michigan, USA
- Yue Cao: Department of Radiation Oncology, University of Michigan, Ann Arbor, Michigan, USA
- Daniel Chang: Department of Radiation Oncology, Stanford University, Palo Alto, California, USA
- I Lei Xing: Department of Radiation Oncology, Stanford University, Palo Alto, California, USA; Department of Electrical Engineering, Stanford University, Palo Alto, California, USA

20. Zhan B, Zhou L, Li Z, Wu X, Pu Y, Zhou J, Wang Y, Shen D. D2FE-GAN: Decoupled dual feature extraction based GAN for MRI image synthesis. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109362]

21. Shen L, Yu L, Zhao W, Pauly J, Xing L. Novel-view X-ray projection synthesis through geometry-integrated deep learning. Med Image Anal 2022;77:102372. [PMID: 35131701] [PMCID: PMC8916089] [DOI: 10.1016/j.media.2022.102372]
Abstract
X-ray imaging is a widely used approach to view the internal structure of a subject for clinical diagnosis, image-guided interventions and decision-making. The X-ray projections acquired at different view angles provide complementary information of patient's anatomy and are required for stereoscopic or volumetric imaging of the subject. In reality, obtaining multiple-view projections inevitably increases radiation dose and complicates clinical workflow. Here we investigate a strategy of obtaining the X-ray projection image at a novel view angle from a given projection image at a specific view angle to alleviate the need for actual projection measurement. Specifically, a Deep Learning-based Geometry-Integrated Projection Synthesis (DL-GIPS) framework is proposed for the generation of novel-view X-ray projections. The proposed deep learning model extracts geometry and texture features from a source-view projection, and then conducts geometry transformation on the geometry features to accommodate the change of view angle. At the final stage, the X-ray projection in the target view is synthesized from the transformed geometry and the shared texture features via an image generator. The feasibility and potential impact of the proposed DL-GIPS model are demonstrated using lung imaging cases. The proposed strategy can be generalized to a general case of multiple projections synthesis from multiple input views and potentially provides a new paradigm for various stereoscopic and volumetric imaging with substantially reduced efforts in data acquisition.
Affiliation(s)
- Liyue Shen: Department of Electrical Engineering, Stanford University, Stanford, CA, USA
- Lequan Yu: Department of Radiation Oncology, Stanford University, Stanford, CA, USA
- Wei Zhao: Department of Radiation Oncology, Stanford University, Stanford, CA, USA
- John Pauly: Department of Electrical Engineering, Stanford University, Stanford, CA, USA
- Lei Xing: Department of Electrical Engineering, Stanford University, Stanford, CA, USA; Department of Radiation Oncology, Stanford University, Stanford, CA, USA

22. Representation Disentanglement for Multi-modal Brain MRI Analysis. Inf Process Med Imaging 2021;12729:321-333. [PMID: 35173402] [DOI: 10.1007/978-3-030-78191-0_25]
Abstract
Multi-modal MRIs are widely used in neuroimaging applications since different MR sequences provide complementary information about brain structures. Recent works have suggested that multi-modal deep learning analysis can benefit from explicitly disentangling anatomical (shape) and modality (appearance) information into separate image presentations. In this work, we challenge mainstream strategies by showing that they do not naturally lead to representation disentanglement both in theory and in practice. To address this issue, we propose a margin loss that regularizes the similarity in relationships of the representations across subjects and modalities. To enable robust training, we further use a conditional convolution to design a single model for encoding images of all modalities. Lastly, we propose a fusion function to combine the disentangled anatomical representations as a set of modality-invariant features for downstream tasks. We evaluate the proposed method on three multi-modal neuroimaging datasets. Experiments show that our proposed method can achieve superior disentangled representations compared to existing disentanglement strategies. Results also indicate that the fused anatomical representation has potential in the downstream task of zero-dose PET reconstruction and brain tumor segmentation.

23. Peng B, Liu B, Bin Y, Shen L, Lei J. Multi-Modality MR Image Synthesis via Confidence-Guided Aggregation and Cross-Modality Refinement. IEEE J Biomed Health Inform 2021;26:27-35. [PMID: 34018939] [DOI: 10.1109/jbhi.2021.3082541]
Abstract
Magnetic resonance imaging (MRI) can provide multi-modality MR images by setting task-specific scan parameters and has been widely used in disease diagnosis and treatment planning. However, in practical clinical applications, it is often difficult to obtain multi-modality MR images simultaneously due to patient discomfort, scanning costs, and other factors. Therefore, how to effectively utilize the existing modality images to synthesize a missing-modality image has become a hot research topic. In this paper, we propose a novel confidence-guided aggregation and cross-modality refinement network (CACR-Net) for multi-modality MR image synthesis, which effectively utilizes complementary and correlative information from multiple modalities to synthesize high-quality target-modality images. Specifically, to effectively utilize the complementary modality-specific characteristics, a confidence-guided aggregation module is proposed to adaptively aggregate the multiple target-modality images generated from multiple source-modality images by using the corresponding confidence maps. Based on the aggregated target-modality image, a cross-modality refinement module is presented to further refine the target-modality image by mining correlative information among the multiple source-modality images and the aggregated target-modality image. By training the proposed CACR-Net in an end-to-end manner, high-quality and sharp target-modality MR images are effectively synthesized. Experimental results on a widely used benchmark demonstrate that the proposed method outperforms state-of-the-art methods.
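
Confidence-guided aggregation of the kind described above can be pictured as a per-pixel weighted average of the candidate target-modality images, with the weights obtained by normalizing the confidence maps across candidates. The sketch below is a minimal stand-in under assumed shapes and a softmax normalization, not the CACR-Net implementation.

```python
# Minimal sketch of confidence-guided aggregation (shapes and normalisation
# are assumptions, not the CACR-Net implementation): each source modality
# produces a candidate target-modality image plus a confidence map, and the
# candidates are averaged pixel-wise with softmax-normalised confidences.
import torch

def aggregate(candidates: torch.Tensor, confidences: torch.Tensor) -> torch.Tensor:
    """candidates, confidences: (M, B, 1, H, W) from M source modalities."""
    weights = torch.softmax(confidences, dim=0)       # normalise across the M candidates
    return (weights * candidates).sum(dim=0)          # (B, 1, H, W) aggregated target image

cands = torch.randn(3, 1, 1, 64, 64)                  # e.g. T2 synthesised from T1, T1ce, FLAIR
confs = torch.randn(3, 1, 1, 64, 64)                  # per-pixel confidence of each candidate
aggregated_t2 = aggregate(cands, confs)
print(aggregated_t2.shape)                            # torch.Size([1, 1, 64, 64])
```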