1. Adamson PM, Desai AD, Dominic J, Varma M, Bluethgen C, Wood JP, Syed AB, Boutin RD, Stevens KJ, Vasanawala S, Pauly JM, Gunel B, Chaudhari AS. Using deep feature distances for evaluating the perceptual quality of MR image reconstructions. Magn Reson Med 2025;94:317-330. [PMID: 39921580; PMCID: PMC12021552; DOI: 10.1002/mrm.30437]
Abstract
PURPOSE Commonly used MR image quality (IQ) metrics have poor concordance with radiologist-perceived diagnostic IQ. Here, we develop and explore deep feature distances (DFDs), distances computed in a lower-dimensional feature space encoded by a convolutional neural network (CNN), as improved perceptual IQ metrics for MR image reconstruction. We further explore the impact of distribution shifts between the images used to train the DFD CNN encoder and the images on which the IQ metrics are evaluated. METHODS We compare commonly used IQ metrics (PSNR and SSIM) to two "out-of-domain" DFDs with encoders trained on natural images, an "in-domain" DFD trained on MR images alone, and two domain-adjacent DFDs trained on large medical imaging datasets. We additionally compare these with several state-of-the-art but less commonly reported IQ metrics: visual information fidelity (VIF), noise quality metric (NQM), and the high-frequency error norm (HFEN). IQ metric performance is assessed via correlations with five expert radiologist reader scores of perceived diagnostic IQ of various accelerated MR image reconstructions. We characterize the behavior of these IQ metrics under common distortions expected during image acquisition, including their sensitivity to acquisition noise. RESULTS All DFDs and HFEN correlate more strongly with radiologist-perceived diagnostic IQ than SSIM, PSNR, and the other state-of-the-art metrics, with correlations comparable to radiologist inter-reader variability. Surprisingly, out-of-domain DFDs perform comparably to in-domain and domain-adjacent DFDs. CONCLUSION A suite of IQ metrics, including DFDs and HFEN, should be used alongside commonly reported IQ metrics for a more holistic evaluation of MR image reconstruction perceptual quality. We also observe that general vision encoders are capable of assessing visual IQ even for MR images.
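The metric families compared above can be made concrete with a short sketch: an out-of-domain deep feature distance (here using an early stage of an ImageNet-pretrained VGG-16 as the encoder) and HFEN. The layer cut and the LoG sigma are illustrative assumptions, not the configuration used in the paper.

```python
# Illustrative deep feature distance (DFD) and high-frequency error norm (HFEN);
# the VGG-16 feature stage and sigma are assumptions, not the paper's exact setup.
import numpy as np
import torch
import torchvision.models as models
from scipy.ndimage import gaussian_laplace

def deep_feature_distance(x, y, encoder):
    """RMS L2 distance between encoder feature maps of two images (N,3,H,W in [0,1])."""
    with torch.no_grad():
        fx, fy = encoder(x), encoder(y)
    return torch.norm(fx - fy) / fx.numel() ** 0.5

def hfen(x, y, sigma=1.5):
    """HFEN: relative L2 norm of the Laplacian-of-Gaussian filtered error."""
    lx, ly = gaussian_laplace(x, sigma), gaussian_laplace(y, sigma)
    return np.linalg.norm(lx - ly) / np.linalg.norm(ly)

# Out-of-domain encoder: early convolutional stage of an ImageNet VGG-16.
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features[:16].eval()

recon = torch.rand(1, 3, 256, 256)  # stand-in reconstruction
ref = torch.rand(1, 3, 256, 256)    # stand-in reference
print(float(deep_feature_distance(recon, ref, vgg)))
print(hfen(recon.numpy()[0, 0], ref.numpy()[0, 0]))
```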
Affiliation(s)
- Philip M. Adamson
- Department of Electrical Engineering, Stanford University, Stanford, California, USA
- Arjun D. Desai
- Department of Electrical Engineering, Stanford University, Stanford, California, USA
- Jeffrey Dominic
- Department of Electrical Engineering, Stanford University, Stanford, California, USA
- Maya Varma
- Department of Computer Science, Stanford University, Stanford, California, USA
- Jeff P. Wood
- Austin Radiological Association, Austin, Texas, USA
- Ali B. Syed
- Department of Radiology, Stanford University, Stanford, California, USA
- Robert D. Boutin
- Department of Radiology, Stanford University, Stanford, California, USA
- John M. Pauly
- Department of Electrical Engineering, Stanford University, Stanford, California, USA
- Beliz Gunel
- Department of Electrical Engineering, Stanford University, Stanford, California, USA
- Akshay S. Chaudhari
- Department of Radiology, Stanford University, Stanford, California, USA
- Department of Biomedical Data Science, Stanford University, Stanford, California, USA
2. Song G, Li K, Wang Z, Liu W, Xue Q, Liang J, Zhou Y, Geng H, Liu D. A fully automatic radiomics pipeline for postoperative facial nerve function prediction of vestibular schwannoma. Neuroscience 2025;574:124-137. [PMID: 40210197; DOI: 10.1016/j.neuroscience.2025.04.008]
Abstract
Vestibular schwannoma (VS) is the most prevalent intracranial schwannoma. Surgery is one of the options for the treatment of VS, with preservation of facial nerve (FN) function being the primary objective. Therefore, postoperative FN function prediction is essential. However, automating such prediction remains a challenge. In this study, we proposed a fully automatic deep learning approach based on multi-sequence magnetic resonance imaging (MRI) to predict FN function after surgery in VS patients. We first developed a segmentation network, 2.5D Trans-UNet, which combines a Transformer and U-Net to optimize contour segmentation for radiomic feature extraction. Next, we built a deep learning network integrating a 1D Convolutional Neural Network (1DCNN) and a Gated Recurrent Unit (GRU) to predict postoperative FN function using the extracted features. We trained and tested the 2.5D Trans-UNet segmentation network on public and private datasets, achieving accuracies of 89.51% and 90.66%, respectively, confirming the model's strong performance. Feature extraction and selection were then performed on the private dataset's segmentation results from 2.5D Trans-UNet, and the selected features were used to train the 1DCNN-GRU network for classification. The results showed that our proposed fully automatic radiomics pipeline outperformed the traditional radiomics pipeline on the test set, achieving an accuracy of 88.64% and demonstrating its effectiveness in predicting postoperative FN function in VS patients. Our proposed automatic method has the potential to become a valuable decision-making tool in neurosurgery, assisting neurosurgeons in making more informed decisions regarding surgical interventions and improving the treatment of VS patients.
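For illustration, a minimal 1DCNN-GRU classifier over a vector of selected radiomic features might look as follows; the channel counts, hidden size, and binary output are assumptions for the sketch, not the paper's architecture.

```python
# Minimal 1DCNN-GRU classifier over a selected radiomic feature vector;
# layer sizes and the two-class output are illustrative assumptions.
import torch
import torch.nn as nn

class CNN1DGRU(nn.Module):
    def __init__(self, n_features=64, hidden=32, n_classes=2):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool1d(2),
        )
        self.gru = nn.GRU(input_size=16, hidden_size=hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):                  # x: (batch, n_features)
        h = self.conv(x.unsqueeze(1))      # (batch, 16, n_features // 2)
        h = h.permute(0, 2, 1)             # GRU expects (batch, seq, channels)
        _, last = self.gru(h)              # final hidden state: (1, batch, hidden)
        return self.fc(last.squeeze(0))    # (batch, n_classes) logits

model = CNN1DGRU()
logits = model(torch.randn(8, 64))  # 8 patients, 64 selected features
print(logits.shape)                 # torch.Size([8, 2])
```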
Affiliation(s)
- Gang Song
- Department of Neurosurgery, Xuanwu Hospital, Capital Medical University, Beijing, China
- Keyuan Li
- School of Information Science and Technology, Beijing University of Technology, Beijing, China
- Zhuozheng Wang
- School of Information Science and Technology, Beijing University of Technology, Beijing, China
- Wei Liu
- School of Information Science and Technology, Beijing University of Technology, Beijing, China
- Qi Xue
- School of Information Science and Technology, Beijing University of Technology, Beijing, China
- Jiantao Liang
- Department of Neurosurgery, Xuanwu Hospital, Capital Medical University, Beijing, China
- Yiqiang Zhou
- Department of Neurosurgery, Xuanwu Hospital, Capital Medical University, Beijing, China
- Haoming Geng
- Department of Neurosurgery, Xuanwu Hospital, Capital Medical University, Beijing, China
- Dong Liu
- Department of Neurosurgery, Xuanwu Hospital, Capital Medical University, Beijing, China
3. Wang Y, Wen Z, Bao S, Huang D, Wang Y, Yang B, Li Y, Zhou P, Zhang H, Pang H. Diffusion-CSPAM U-Net: A U-Net model integrated hybrid attention mechanism and diffusion model for segmentation of computed tomography images of brain metastases. Radiat Oncol 2025;20:50. [PMID: 40188354; PMCID: PMC11971865; DOI: 10.1186/s13014-025-02622-x]
Abstract
BACKGROUND Brain metastases are common complications in patients with cancer and significantly affect prognosis and treatment strategies. Accurate segmentation of brain metastases is crucial for effective radiation therapy planning. However, in resource-limited areas, the unavailability of MRI is a significant challenge that necessitates reliable segmentation models for computed tomography (CT) images. PURPOSE This study aimed to develop and evaluate a Diffusion-CSPAM U-Net model for the segmentation of brain metastases on CT images and thereby provide a robust tool for radiation oncologists in regions where magnetic resonance imaging (MRI) is not accessible. METHODS The proposed Diffusion-CSPAM U-Net model integrates diffusion models with channel-spatial-positional attention mechanisms to enhance segmentation performance. The model was trained and validated on CT images from two centers (n = 205 and n = 45). Performance metrics, including the Dice similarity coefficient (DSC), intersection over union (IoU), accuracy, sensitivity, and specificity, were calculated. Additionally, this study compared the performance of the proposed model on brain metastases of different sizes and against models proposed in other studies. RESULTS On the external validation set, the Diffusion-CSPAM U-Net model achieved promising results: an overall average DSC of 79.3% ± 13.3%, IoU of 69.2% ± 13.3%, accuracy of 95.5% ± 11.8%, sensitivity of 80.3% ± 12.1%, specificity of 93.8% ± 14.0%, and Hausdorff distance (HD) of 5.606 ± 0.990 mm. These results demonstrate favorable improvements over existing models. CONCLUSIONS The Diffusion-CSPAM U-Net model showed promising results in segmenting brain metastases in CT images, particularly in terms of sensitivity and accuracy, and provides an effective tool for radiation oncologists.
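The overlap metrics reported above have standard definitions; a short NumPy sketch for binary masks is given below as a worked example, not the authors' evaluation code.

```python
# DSC, IoU, accuracy, sensitivity, and specificity from binary masks.
import numpy as np

def segmentation_metrics(pred, gt):
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    tn = np.logical_and(~pred, ~gt).sum()
    return {
        "DSC": 2 * tp / (2 * tp + fp + fn),
        "IoU": tp / (tp + fp + fn),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Toy 2D example: two overlapping square "metastases".
pred = np.zeros((64, 64), dtype=np.uint8); pred[20:40, 20:40] = 1
gt = np.zeros((64, 64), dtype=np.uint8);   gt[22:42, 22:42] = 1
print(segmentation_metrics(pred, gt))
```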
Affiliation(s)
- Yiren Wang
- School of Nursing, Southwest Medical University, Luzhou, 646000, China
- Wound Healing Basic Research and Clinical Application Key Laboratory of Luzhou, School of Nursing, Southwest Medical University, Luzhou, 646000, China
- Zhongjian Wen
- School of Nursing, Southwest Medical University, Luzhou, 646000, China
- Wound Healing Basic Research and Clinical Application Key Laboratory of Luzhou, School of Nursing, Southwest Medical University, Luzhou, 646000, China
- Shuilan Bao
- School of Nursing, Southwest Medical University, Luzhou, 646000, China
- Wound Healing Basic Research and Clinical Application Key Laboratory of Luzhou, School of Nursing, Southwest Medical University, Luzhou, 646000, China
- Delong Huang
- Department of Clinical Medicine, Southwest Medical University, Luzhou, 646000, China
- Youhua Wang
- Gulin County People's Hospital, Luzhou, 646000, China
- Bo Yang
- Department of Oncology, The Affiliated Hospital of Southwest Medical University, No.25 Taiping Street, Jiangyang District, Luzhou, 646000, Sichuan, China
- Yunfei Li
- Department of Oncology, The Affiliated Hospital of Southwest Medical University, No.25 Taiping Street, Jiangyang District, Luzhou, 646000, Sichuan, China
- Ping Zhou
- Wound Healing Basic Research and Clinical Application Key Laboratory of Luzhou, School of Nursing, Southwest Medical University, Luzhou, 646000, China
- Department of Radiology, The Affiliated Hospital of Southwest Medical University, No.25 Taiping Street, Jiangyang District, Luzhou, 646000, Sichuan, China
- Huaiwen Zhang
- Department of Radiotherapy, Jiangxi Cancer Hospital, The Second Affiliated Hospital of Nanchang Medical College, Jiangxi Clinical Research Center for Cancer, No.519 Beijing East Road, Donghu District, Nanchang, 330029, Jiangxi, China
- Haowen Pang
- Department of Oncology, The Affiliated Hospital of Southwest Medical University, No.25 Taiping Street, Jiangyang District, Luzhou, 646000, Sichuan, China
4. Lee J, Kim D, Kim T, Al-Masni MA, Han Y, Kim DH, Ryu K. Meta-learning guidance for robust medical image synthesis: Addressing the real-world misalignment and corruptions. Comput Med Imaging Graph 2025;121:102506. [PMID: 39914125; DOI: 10.1016/j.compmedimag.2025.102506]
Abstract
Deep learning-based image synthesis for medical imaging is currently an active research topic with various clinically relevant applications. Recently, methods allowing training with misaligned data have started to emerge, yet current solutions lack robustness and cannot handle other corruptions in the dataset. In this work, we propose a solution for training synthesis networks on datasets affected by mis-registration, artifacts, and deformations. Our proposed method consists of three key innovations: a meta-learning-inspired re-weighting scheme that directly decreases the influence of corrupted instances in a mini-batch by assigning them lower weights in the loss function, a non-local feature-based loss function, and joint training of the image synthesis network together with spatial transformer network (STN)-based registration networks with specially designed regularization. The efficacy of our method is validated in a controlled synthetic scenario, as well as on a public dataset with such corruptions. This work introduces a new framework that may be applicable to challenging scenarios and other more difficult datasets.
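The re-weighting idea can be pictured with a toy learning-to-reweight step on a linear model, where each training instance's weight is set by the sensitivity of a small trusted validation loss to that instance. Everything below (the model, loss, and hyperparameters) is a simplified stand-in for the scheme used inside the synthesis network.

```python
# Toy meta-learning re-weighting step (learning-to-reweight style): down-weight
# training instances whose gradient hurts a small clean validation set.
import torch

torch.manual_seed(0)
d, n, lr = 5, 32, 0.1
w = torch.randn(d, 1, requires_grad=True)
x_tr, y_tr = torch.randn(n, d), torch.randn(n, 1)    # possibly corrupted batch
x_val, y_val = torch.randn(8, d), torch.randn(8, 1)  # small trusted batch

eps = torch.zeros(n, requires_grad=True)             # per-instance weights
per_sample = ((x_tr @ w - y_tr) ** 2).squeeze(1)
gw, = torch.autograd.grad((eps * per_sample).sum(), w, create_graph=True)
w_virtual = w - lr * gw                              # one virtual SGD step
val_loss = ((x_val @ w_virtual - y_val) ** 2).mean()
geps, = torch.autograd.grad(val_loss, eps)           # sensitivity of val loss

weights = torch.clamp(-geps, min=0.0)                # keep only helpful instances
weights = weights / (weights.sum() + 1e-8)
reweighted_loss = (weights.detach() * per_sample).sum()  # used to update w
print(weights)
```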
Affiliation(s)
- Jaehun Lee
- Intelligence and Interaction Research Center, Korea Institute of Science and Technology, Seoul, Republic of Korea; Department of Electrical and Electronic Engineering, College of Engineering, Yonsei University, Seoul, Republic of Korea
- Daniel Kim
- Department of Electrical and Electronic Engineering, College of Engineering, Yonsei University, Seoul, Republic of Korea
- Taehun Kim
- Intelligence and Interaction Research Center, Korea Institute of Science and Technology, Seoul, Republic of Korea
- Mohammed A Al-Masni
- Department of Artificial Intelligence, Sejong University, Seoul, Republic of Korea
- Yoseob Han
- Department of Intelligent Semiconductors, Soongsil University, Seoul, Republic of Korea
- Dong-Hyun Kim
- Department of Electrical and Electronic Engineering, College of Engineering, Yonsei University, Seoul, Republic of Korea
- Kanghyun Ryu
- Intelligence and Interaction Research Center, Korea Institute of Science and Technology, Seoul, Republic of Korea
5. Ahmed S, Jinchao F, Manan MA, Yaqub M, Ali MU, Raheem A. FedGraphMRI-net: A federated graph neural network framework for robust MRI reconstruction across non-IID data. Biomed Signal Process Control 2025;102:107360. [DOI: 10.1016/j.bspc.2024.107360]
6. Chen S, Zhang R, Liang H, Qian Y, Zhou X. Coupling of state space modules and attention mechanisms: An input-aware multi-contrast MRI synthesis method. Med Phys 2025;52:2269-2278. [PMID: 39714363; DOI: 10.1002/mp.17598]
Abstract
BACKGROUND Medical imaging plays a pivotal role in the real-time monitoring of patients during diagnostic and therapeutic processes. However, in clinical scenarios, the acquisition of multi-modal imaging protocols is often impeded by a number of factors, including time and economic costs, patients' willingness to cooperate, imaging quality, and even safety concerns. PURPOSE We proposed a learning-based medical image synthesis method to simplify the acquisition of multi-contrast MRI. METHODS We redesigned the basic structure of the Mamba block and explored different integration patterns between Mamba layers and Transformer layers to make it more suitable for medical image synthesis tasks. Experiments were conducted on the IXI (575 samples in total; training set: 450; validation set: 25; test set: 100) and BRATS (494 samples in total; training set: 350; validation set: 44; test set: 100) datasets to assess the synthesis performance of our proposed method in comparison to some state-of-the-art models on the task of multi-contrast MRI synthesis. RESULTS Our proposed model outperformed other state-of-the-art models in some multi-contrast MRI synthesis tasks. In the synthesis task from T1 to PD, our proposed method achieved a peak signal-to-noise ratio (PSNR) of 33.70 dB (95% CI, 33.61, 33.79) and a structural similarity index (SSIM) of 0.966 (95% CI, 0.964, 0.968). In the synthesis task from T2 to PD, the model achieved a PSNR of 33.90 dB (95% CI, 33.82, 33.98) and an SSIM of 0.971 (95% CI, 0.969, 0.973). In the synthesis task from FLAIR to T2, the model achieved a PSNR of 30.43 dB (95% CI, 30.29, 30.57) and an SSIM of 0.938 (95% CI, 0.935, 0.941). CONCLUSIONS Our proposed method could effectively model not only the high-dimensional, nonlinear mapping relationships between the magnetic signals of hydrogen nuclei in tissues and the proton density signals in tissues, but also the recovery process of suppressed fluid signals in FLAIR. The proposed model employed distinct mechanisms when synthesizing images from normal and lesion samples, which demonstrated that it had a profound comprehension of the input data. We also showed that in a hierarchical network, only the deeper self-attention layers were responsible for directing more attention to lesion areas.
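One common way to obtain 95% confidence intervals like those quoted above is a nonparametric bootstrap over per-case test scores, sketched here; whether the authors used this exact procedure is an assumption.

```python
# Nonparametric bootstrap 95% CI over per-case PSNR values; the per-case
# scores below are stand-ins, not the study's data.
import numpy as np

rng = np.random.default_rng(0)
psnr_per_case = rng.normal(33.7, 0.5, size=100)  # stand-in per-case scores

boot_means = np.array([
    rng.choice(psnr_per_case, size=psnr_per_case.size, replace=True).mean()
    for _ in range(10_000)
])
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"mean {psnr_per_case.mean():.2f} dB, 95% CI ({lo:.2f}, {hi:.2f})")
```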
Affiliation(s)
- Shuai Chen
- Jiangsu Key Laboratory for Biomaterials and Devices, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Ruoyu Zhang
- Jiangsu Key Laboratory for Biomaterials and Devices, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Huazheng Liang
- Monash Suzhou Research Institute, Suzhou, Jiangsu Province, China
- Shanghai Key Laboratory of Anesthesiology and Brain Functional Modulation, Clinical Research Center for Anesthesiology and Perioperative Medicine, Translational Research Institute of Brain and Brain-Like Intelligence, Shanghai Fourth People's Hospital, School of Medicine, Tongji University, Shanghai, China
- Yunzhu Qian
- Department of Stomatology, The Fourth Affiliated Hospital of Soochow University, Suzhou Dushu Lake Hospital, Medical Center of Soochow University, Suzhou, Jiangsu Province, China
- Xuefeng Zhou
- Jiangsu Key Laboratory for Biomaterials and Devices, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
7. Zhang J, Zeng X. M2OCNN: Many-to-One Collaboration Neural Networks for simultaneously multi-modal medical image synthesis and fusion. Comput Methods Programs Biomed 2025;261:108612. [PMID: 39908634; DOI: 10.1016/j.cmpb.2025.108612]
Abstract
BACKGROUND AND OBJECTIVE Acquiring comprehensive information from multi-modal medical images remains a challenge in clinical diagnostics and treatment, due to complex inter-modal dependencies and missing modalities. While cross-modal medical image synthesis (CMIS) and multi-modal medical image fusion (MMIF) address certain issues, existing methods typically treat these as separate tasks, lacking a unified framework that can generate both synthesized and fused images in the presence of missing modalities. METHODS In this paper, we propose the Many-to-One Collaboration Neural Network (M2OCNN), a unified model designed to simultaneously address CMIS and MMIF. Unlike traditional approaches, M2OCNN treats fusion as a specific form of synthesis and provides a comprehensive solution even when modalities are missing. The network consists of three modules: the Parallel Untangling Hybrid Network, Comprehensive Feature Router, and Series Omni-modal Hybrid Network. Additionally, we introduce a mixed-resolution attention mechanism and two transformer variants, Coarsormer and ReCoarsormer, to suppress high-frequency interference and enhance model performance. M2OCNN outperformed state-of-the-art methods on three multi-modal medical imaging datasets, achieving an average PSNR improvement of 2.4 dB in synthesis tasks and producing high-quality fusion images despite missing modalities. The source code is available at https://github.com/zjno108/M2OCNN. CONCLUSION M2OCNN offers a novel solution by unifying CMIS and MMIF tasks in a single framework, enabling the generation of both synthesized and fused images from a single modality. This approach sets a new direction for research in multi-modal medical imaging, with implications for improving clinical diagnosis and treatment.
Affiliation(s)
- Jian Zhang
- Chongqing Key Laboratory of Image Cognition, College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Xianhua Zeng
- Chongqing Key Laboratory of Image Cognition, College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
8. Wu M, Zhang L, Yap PT, Zhu H, Liu M. Disentangled latent energy-based style translation: An image-level structural MRI harmonization framework. Neural Netw 2025;184:107039. [PMID: 39700825; PMCID: PMC11802304; DOI: 10.1016/j.neunet.2024.107039]
Abstract
Brain magnetic resonance imaging (MRI) has been extensively employed across clinical and research fields, but often exhibits sensitivity to site effects arising from non-biological variations such as differences in field strength and scanner vendors. Numerous retrospective MRI harmonization techniques have demonstrated encouraging outcomes in reducing site effects at the image level. However, existing methods generally suffer from high computational requirements and limited generalizability, restricting their applicability to unseen MRIs. In this paper, we design a novel disentangled latent energy-based style translation (DLEST) framework for unpaired image-level MRI harmonization, consisting of (a) site-invariant image generation (SIG), (b) site-specific style translation (SST), and (c) site-specific MRI synthesis (SMS). Specifically, SIG employs a latent autoencoder to encode MRIs into a low-dimensional latent space and reconstruct MRIs from latent codes. SST utilizes an energy-based model to capture the global latent distribution of a target domain and translate source latent codes toward the target domain, while SMS enables MRI synthesis with a target-specific style. By disentangling image generation and style translation in latent space, DLEST achieves efficient style translation. Our model was trained on T1-weighted MRIs from a public dataset (3,984 subjects across 58 acquisition sites/settings) and validated on an independent dataset (9 traveling subjects scanned at 11 sites/settings) in four tasks: histogram and feature visualization, site classification, brain tissue segmentation, and site-specific structural MRI synthesis. Qualitative and quantitative results demonstrate the superiority of our method over several state-of-the-art methods.
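The style-translation step of an energy-based model can be pictured as Langevin dynamics on latent codes under a learned energy function; the sketch below uses a placeholder MLP energy and arbitrary step sizes, not the DLEST architecture or settings.

```python
# Schematic latent-space Langevin translation for an energy-based model:
# z is pushed toward low-energy (target-site) regions. The MLP energy,
# step size, and iteration count are placeholders.
import torch
import torch.nn as nn

energy = nn.Sequential(nn.Linear(128, 256), nn.SiLU(), nn.Linear(256, 1))

def langevin_translate(z, steps=60, step_size=0.01):
    z = z.clone().detach().requires_grad_(True)
    for _ in range(steps):
        e = energy(z).sum()
        grad, = torch.autograd.grad(e, z)
        with torch.no_grad():  # gradient descent on energy plus injected noise
            z += -0.5 * step_size * grad + (step_size ** 0.5) * torch.randn_like(z)
    return z.detach()

z_source = torch.randn(4, 128)            # latent codes from a source-site encoder
z_target = langevin_translate(z_source)   # decode these to obtain harmonized MRI
print(z_target.shape)
```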
Affiliation(s)
- Mengqi Wu
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; Joint Department of Biomedical Engineering, University of North Carolina at Chapel Hill and North Carolina State University, Chapel Hill, NC 27599, USA
- Lintao Zhang
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Pew-Thian Yap
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Hongtu Zhu
- Department of Biostatistics and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Mingxia Liu
- Department of Radiology and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
9. Cao B, Qi G, Zhao J, Zhu P, Hu Q, Gao X. RTF: Recursive TransFusion for Multi-Modal Image Synthesis. IEEE Trans Image Process 2025;34:1573-1587. [PMID: 40031796; DOI: 10.1109/tip.2025.3541877]
Abstract
Multi-modal image synthesis is crucial for obtaining complete modalities due to the imaging restrictions in reality. Current methods, primarily CNN-based models, find it challenging to extract global representations because of their local inductive bias, leading to synthetic structure deformation or color distortion. Despite the significant global representation ability of the transformer in capturing long-range dependencies, its huge parameter size requires considerable training data. Multi-modal synthesis based solely on one of the two structures makes it hard to extract comprehensive information from each modality with limited data. To tackle this dilemma, we propose a simple yet effective Recursive TransFusion (RTF) framework for multi-modal image synthesis. Specifically, we develop a TransFusion unit to integrate local knowledge extracted from the individual modality by connecting a CNN-based local representation block (LRB) and a transformer-based global fusion block (GFB) via a feature translating gate (FTG). Considering the numerous parameters introduced by the transformer, we further unfold the TransFusion unit repeatedly under a recursive constraint, forming the Recursive TransFusion (RTF), which progressively extracts multi-modal information at different depths. Our RTF remarkably reduces network parameters while maintaining superior performance. Extensive experiments validate our superiority against the competing methods on multiple benchmarks. The source code will be available at https://github.com/guoliangq/RTF.
10. D N S, Pai RM, Bhat SN, Pai M M M. Assessment of perceived realism in AI-generated synthetic spine fracture CT images. Technol Health Care 2025;33:931-944. [PMID: 40105176; DOI: 10.1177/09287329241291368]
Abstract
BACKGROUND Deep learning-based decision support systems require synthetic images generated by adversarial networks, and these require clinical evaluation to ensure their quality. OBJECTIVE The study evaluates the perceived realism of high-dimension synthetic spine fracture CT images generated by Progressive Growing Generative Adversarial Networks (PGGANs). METHODS The study used 2,820 spine fracture CT images from 456 patients to train a PGGAN model. The model synthesized images up to 512 × 512 pixels, and the realism of the generated images was assessed using Visual Turing Tests (VTT) and a Fracture Identification Test (FIT). Three spine surgeons evaluated the images, and the clinical evaluation results were statistically analysed. RESULTS Spine surgeons had an average prediction accuracy of nearly 50% during clinical evaluation, indicating difficulty in distinguishing between real and generated images. Accuracy varied across image dimensions, with synthetic images being most realistic at 512 × 512 pixels. During FIT, among 16 generated images of each fracture type, 13-15 were correctly identified, indicating that the 512 × 512 images are realistic and clearly depict fracture lines. CONCLUSION The study reveals that a PGGAN can generate realistic synthetic spine fracture CT images up to 512 × 512 pixels that are difficult to distinguish from real images, supporting the development of automatic spine fracture type detection systems.
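Chance-level VTT accuracy can be tested formally with a binomial test against p = 0.5; the counts below are made up for illustration.

```python
# Testing whether a reader beats chance in a Visual Turing Test: accuracy near
# 50% means real and synthetic images are statistically indistinguishable.
from scipy.stats import binomtest

n_images = 100   # images shown to one reader (assumed count)
n_correct = 52   # correct real-vs-synthetic calls (assumed count)
result = binomtest(n_correct, n_images, p=0.5, alternative="two-sided")
print(f"accuracy {n_correct / n_images:.2f}, p = {result.pvalue:.3f}")
# A large p-value means chance-level performance cannot be rejected.
```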
Affiliation(s)
- Sindhura D N
- Department of Data Science and Computer Applications, Manipal Institute of Technology, Manipal, Manipal Academy of Higher Education, Manipal, India
- Radhika M Pai
- Department of Data Science and Computer Applications, Manipal Institute of Technology, Manipal, Manipal Academy of Higher Education, Manipal, India
- Shyamasunder N Bhat
- Department of Orthopaedics, Kasturba Medical College, Manipal, Manipal Academy of Higher Education, Manipal, India
- Manohara Pai M M
- Department of Information and Communication Technology, Manipal Institute of Technology, Manipal, Manipal Academy of Higher Education, Manipal, India
11. Yu X, Hu D, Yao Q, Fu Y, Zhong Y, Wang J, Tian M, Zhang H. Diffused Multi-scale Generative Adversarial Network for low-dose PET images reconstruction. Biomed Eng Online 2025;24:16. [PMID: 39924498; PMCID: PMC11807330; DOI: 10.1186/s12938-025-01348-x]
Abstract
PURPOSE The aim of this study is to convert low-dose PET (L-PET) images to full-dose PET (F-PET) images with our Diffused Multi-scale Generative Adversarial Network (DMGAN), offering a potential balance between reducing radiation exposure and maintaining diagnostic performance. METHODS The proposed method includes two modules: a diffusion generator and a U-Net discriminator. The first module extracts information at different levels, enhancing the generator's ability to generalize across images and improving training stability. Generated images are passed to the U-Net discriminator, which extracts details from both overall and local perspectives to enhance the quality of the generated F-PET images. We conducted both qualitative assessments and quantitative measurements. For quantitative comparison, we employed two metrics, the structural similarity index measure (SSIM) and the peak signal-to-noise ratio (PSNR), to evaluate the performance of the different methods. RESULTS Our proposed method achieved the highest PSNR and SSIM scores among the compared methods, improving PSNR by at least 6.2%. Compared to other methods, the synthesized full-dose PET images generated by our method exhibit a more accurate voxel-wise metabolic intensity distribution, resulting in a clearer depiction of the epilepsy focus. CONCLUSIONS The proposed method demonstrates improved restoration of original details from low-dose PET images compared to other models trained on the same datasets, offering a potential balance between minimizing radiation exposure and preserving diagnostic performance.
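Both reported metrics are available off the shelf in scikit-image; a generic usage example with stand-in arrays (not the study's data) is shown below.

```python
# PSNR and SSIM between a reference full-dose slice and a synthesized slice.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
f_pet = rng.random((128, 128)).astype(np.float32)  # stand-in full-dose slice
synth = np.clip(f_pet + 0.05 * rng.standard_normal(f_pet.shape), 0, 1).astype(np.float32)

print("PSNR:", peak_signal_noise_ratio(f_pet, synth, data_range=1.0))
print("SSIM:", structural_similarity(f_pet, synth, data_range=1.0))
```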
Affiliation(s)
- Xiang Yu
- Polytechnic Institute, Zhejiang University, Hangzhou, China
- Daoyan Hu
- The College of Biomedical Engineering and Instrument Science of Zhejiang University, Hangzhou, China
- Qiong Yao
- Department of Nuclear Medicine and Medical PET Center, The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, 310009, China
- Yu Fu
- College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, China
- Yan Zhong
- Department of Nuclear Medicine and Medical PET Center, The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, 310009, China
- Jing Wang
- Department of Nuclear Medicine and Medical PET Center, The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, 310009, China
- Mei Tian
- Human Phenome Institute, Fudan University, 825 Zhangheng Road, Shanghai, 201203, China
- Hong Zhang
- The College of Biomedical Engineering and Instrument Science of Zhejiang University, Hangzhou, China
- Department of Nuclear Medicine and Medical PET Center, The Second Affiliated Hospital of Zhejiang University School of Medicine, Hangzhou, 310009, China
- Key Laboratory for Biomedical Engineering of Ministry of Education, Zhejiang University, Hangzhou, China
12. Zhang T, Pang H, Wu Y, Xu J, Liu L, Li S, Xia S, Chen R, Liang Z, Qi S. BreathVisionNet: A pulmonary-function-guided CNN-transformer hybrid model for expiratory CT image synthesis. Comput Methods Programs Biomed 2025;259:108516. [PMID: 39571504; DOI: 10.1016/j.cmpb.2024.108516]
Abstract
BACKGROUND AND OBJECTIVE Chronic obstructive pulmonary disease (COPD) is highly heterogeneous in etiology and clinical manifestation. Expiratory computed tomography (CT) can effectively assess air trapping, aiding in disease diagnosis. However, due to concerns about radiation exposure and cost, expiratory CT is not routinely performed. Recent work on synthesizing expiratory CT has primarily focused on imaging features while neglecting patient-specific pulmonary function. METHODS To address these issues, we developed a novel model named BreathVisionNet that incorporates pulmonary function data to guide the synthesis of expiratory CT from inspiratory CT. An architecture combining a convolutional neural network and transformer is introduced to handle the irregular phenotypic distribution in COPD patients. The model can better understand long-range and global contexts by incorporating global information into the encoder. The utilization of edge information and multi-view data further enhances the quality of the synthesized CT. Parametric response mapping (PRM) can be estimated using the synthesized expiratory CT and the inspiratory CT to quantify the COPD phenotypes of normal, emphysema, and functional small airway disease (fSAD), including their percentages, spatial distributions, and voxel distribution maps. RESULTS BreathVisionNet outperforms other generative models in terms of synthesized image quality. It achieves a mean absolute error, normalized mean square error, structural similarity index, and peak signal-to-noise ratio of 78.207 HU, 0.643, 0.847, and 25.828 dB, respectively. Comparing the predicted and real PRM, the Dice coefficient reaches 0.732 (emphysema) and 0.560 (fSAD). The mean difference between true and predicted fSAD percentages is 4.42 on the development dataset (low radiation dose CT scans) and 9.05 on an independent external validation dataset (routine dose), indicating that the model generalizes well. A classifier trained on the voxel distribution maps achieves an accuracy of 0.891 in predicting the presence of COPD. CONCLUSIONS BreathVisionNet can accurately synthesize expiratory CT images from inspiratory CT and predict their voxel distribution. The estimated PRM can help quantify the COPD phenotypes of normal, emphysema, and fSAD, providing additional insight into COPD heterogeneity when only inspiratory CT images are available.
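PRM assigns each lung voxel a phenotype from its paired inspiratory/expiratory attenuation values; a sketch using the -950/-856 HU cut-offs common in the PRM literature is below (whether the paper uses exactly these thresholds is an assumption, and the arrays are synthetic).

```python
# Parametric response mapping (PRM) labels from co-registered inspiratory and
# (synthesized) expiratory CT, using the conventional -950/-856 HU cut-offs.
import numpy as np

def prm_labels(insp_hu, exp_hu, lung_mask):
    labels = np.zeros(insp_hu.shape, dtype=np.uint8)  # 0 = outside lung / unclassified
    normal = (insp_hu >= -950) & (exp_hu >= -856)
    fsad = (insp_hu >= -950) & (exp_hu < -856)
    emphysema = (insp_hu < -950) & (exp_hu < -856)
    # Voxels with insp < -950 but exp >= -856 stay unclassified, as is conventional.
    labels[lung_mask & normal] = 1
    labels[lung_mask & fsad] = 2
    labels[lung_mask & emphysema] = 3
    return labels

insp = np.random.uniform(-1000, -700, (4, 64, 64))
exp = insp + np.random.uniform(0, 200, insp.shape)  # lungs densify on expiration
mask = np.ones(insp.shape, dtype=bool)
lab = prm_labels(insp, exp, mask)
print(f"fSAD: {100 * (lab == 2).sum() / mask.sum():.1f}%")
```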
Affiliation(s)
- Tiande Zhang
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, China
- Haowen Pang
- School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing, China
- Yanan Wu
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Jiaxuan Xu
- State Key Laboratory of Respiratory Disease, National Clinical Research Center for Respiratory Disease, Guangzhou Institute of Respiratory Health, The National Center for Respiratory Medicine, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Lingkai Liu
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, China
- Shang Li
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, China
- Shuyue Xia
- Department of Respiratory and Critical Care Medicine, Central Hospital Affiliated to Shenyang Medical College, Shenyang, China
- Rongchang Chen
- State Key Laboratory of Respiratory Disease, National Clinical Research Center for Respiratory Disease, Guangzhou Institute of Respiratory Health, The National Center for Respiratory Medicine, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China; Hetao Institute of Guangzhou National Laboratory, Guangzhou, China
- Zhenyu Liang
- State Key Laboratory of Respiratory Disease, National Clinical Research Center for Respiratory Disease, Guangzhou Institute of Respiratory Health, The National Center for Respiratory Medicine, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Shouliang Qi
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, China; Department of Respiratory and Critical Care Medicine, Central Hospital Affiliated to Shenyang Medical College, Shenyang, China
13. Patil SS, Rajak R, Ramteke M, Rathore AS. MMIT-DDPM - Multilateral medical image translation with class and structure supervised diffusion-based model. Comput Biol Med 2025;185:109501. [PMID: 39626456; DOI: 10.1016/j.compbiomed.2024.109501]
Abstract
Unified translation of medical images from one to many distinct modalities is desirable in healthcare settings. A ubiquitous approach for bilateral medical scan translation is one-to-one mapping with GANs. However, its efficacy in encapsulating the diversity of a pool of medical scans and performing one-to-many translation is questionable. In contrast, the Denoising Diffusion Probabilistic Model (DDPM) exhibits exceptional image generation ability owing to its scalability and its capacity to capture the distribution of the whole training data. Therefore, we propose a novel conditioning mechanism for the deterministic translation of medical scans from a source modality to any target modality with a DDPM. The model denoises the target modality under the guidance of a source-modality structure encoder and a source-to-target class conditioner. Consequently, this mechanism serves as prior information for sampling the desired target modality during inference. Training and testing were carried out on the T1-weighted, T2-weighted, and Fluid Attenuated Inversion Recovery (FLAIR) sequences of the BraTS 2021 dataset. The proposed model is capable of unified multilateral translation among the six combinations of T1ce, T2, and FLAIR brain MRI sequences, eliminating the need for multiple bilateral translation models. We have analyzed the performance of our architecture against state-of-the-art convolution- and transformer-based GANs. The diffusion model efficiently covers the distribution of multiple modalities while producing better image quality of the translated sequences, as evidenced by an average improvement of 8.06% in Multi-Scale Structural Similarity (MSSIM) and 2.52 in Fréchet Inception Distance (FID) compared with the CNN- and transformer-based GAN architectures.
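The conditioning mechanism can be summarized by a standard eps-prediction DDPM training step in which the denoiser also receives the source-modality image and a translation-class embedding; the tiny network below is a placeholder for the paper's denoiser, and all sizes are illustrative.

```python
# One conditional DDPM training step (standard eps-prediction objective) with
# structure (source image) and class (translation direction) conditioning.
import torch
import torch.nn as nn
import torch.nn.functional as F

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alpha_bar = torch.cumprod(1.0 - betas, dim=0)

class TinyDenoiser(nn.Module):
    def __init__(self, n_classes=6):
        super().__init__()
        self.cls = nn.Embedding(n_classes, 8)
        self.net = nn.Sequential(
            nn.Conv2d(2 + 8, 32, 3, padding=1), nn.SiLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, x_t, src, cls_id):
        # Broadcast the class embedding to a spatial map and concatenate.
        c = self.cls(cls_id)[:, :, None, None].expand(-1, -1, *x_t.shape[2:])
        return self.net(torch.cat([x_t, src, c], dim=1))

model = TinyDenoiser()
x0 = torch.randn(4, 1, 32, 32)      # target modality (e.g., T2)
src = torch.randn(4, 1, 32, 32)     # source modality (e.g., T1ce)
cls_id = torch.randint(0, 6, (4,))  # which of the six translation directions

t = torch.randint(0, T, (4,))
noise = torch.randn_like(x0)
ab = alpha_bar[t][:, None, None, None]
x_t = ab.sqrt() * x0 + (1 - ab).sqrt() * noise  # forward process q(x_t | x_0)
loss = F.mse_loss(model(x_t, src, cls_id), noise)
loss.backward()
print(float(loss))
```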
Affiliation(s)
- Rishav Rajak
- Department of Chemical Engineering, IIT Delhi, India
- Manojkumar Ramteke
- Department of Chemical Engineering, IIT Delhi, India; Yardi School of Artificial Intelligence, IIT Delhi, India
- Anurag S Rathore
- Department of Chemical Engineering, IIT Delhi, India; Yardi School of Artificial Intelligence, IIT Delhi, India
14. Zhou M, Wagner MW, Tabori U, Hawkins C, Ertl-Wagner BB, Khalvati F. Generating 3D brain tumor regions in MRI using vector-quantization Generative Adversarial Networks. Comput Biol Med 2025;185:109502. [PMID: 39700855; DOI: 10.1016/j.compbiomed.2024.109502]
Abstract
Medical image analysis has significantly benefited from advancements in deep learning, particularly in the application of Generative Adversarial Networks (GANs) for generating realistic and diverse images that can augment training datasets. The common GAN-based approach is to generate entire image volumes, rather than the region of interest (ROI). Research on deep learning-based brain tumor classification using MRI has shown that it is easier to classify tumor ROIs than entire image volumes. In this work, we present a novel framework that uses a vector-quantization GAN and a transformer incorporating masked token modeling to generate high-resolution and diverse 3D brain tumor ROIs that can be used as additional data for tumor ROI classification. We apply our method to two imbalanced datasets where we augment the minority class: (1) low-grade glioma (LGG) ROIs from the Multimodal Brain Tumor Segmentation Challenge (BraTS) 2019 dataset; (2) BRAF V600E Mutation genetic marker tumor ROIs from the internal pediatric LGG (pLGG) dataset. We show that the proposed method outperforms various baseline models qualitatively and quantitatively. The generated data were then used to balance the datasets for brain tumor type classification. Our approach demonstrates superior performance, surpassing baseline models by 6.4% in the area under the ROC curve (AUC) on the BraTS 2019 dataset and 4.3% in the AUC on the internal pLGG dataset. The results indicate that the generated tumor ROIs can effectively address the imbalanced data problem. Our proposed method has the potential to facilitate an accurate diagnosis of rare brain tumors using MRI scans.
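Masked token modeling over VQ codebook indices works roughly as sketched below (MaskGIT-style): mask a random subset of the discrete tokens and train a transformer to recover them. All sizes are illustrative, and the token sequence would come from a separately trained VQ-GAN.

```python
# Masked token modeling over VQ codebook indices: mask ~half the tokens and
# train a transformer to predict the original indices at the masked positions.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, seq_len, dim = 512, 64, 128  # codebook size, tokens per ROI, model width
mask_id = vocab                     # extra id reserved for the [MASK] token

embed = nn.Embedding(vocab + 1, dim)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
    num_layers=2)
head = nn.Linear(dim, vocab)

tokens = torch.randint(0, vocab, (8, seq_len))  # indices from a trained VQ-GAN
mask = torch.rand(8, seq_len) < 0.5             # random masking pattern
inputs = tokens.masked_fill(mask, mask_id)

logits = head(encoder(embed(inputs)))               # (8, seq_len, vocab)
loss = F.cross_entropy(logits[mask], tokens[mask])  # loss only on masked tokens
loss.backward()
print(float(loss))
```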
Affiliation(s)
- Meng Zhou
- Department of Computer Science, University of Toronto, 40 St George St., Toronto, M5S 2E4, ON, Canada; Neurosciences & Mental Health Research Program, The Hospital for Sick Children, 686 Bay St., Toronto, M5G 0A4, ON, Canada
- Matthias W Wagner
- Department of Diagnostic and Interventional Radiology, The Hospital for Sick Children, 170 Elizabeth St., Toronto, M5G 1H3, ON, Canada; Institute of Diagnostic and Interventional Neuroradiology, University Hospital Augsburg, Stenglinstraße 2, Augsburg, 86156, Germany
- Uri Tabori
- Division of Neuroradiology, Neurooncology, The Hospital for Sick Children, 170 Elizabeth St., Toronto, M5G 1H3, ON, Canada
- Cynthia Hawkins
- Paediatric Laboratory Medicine, Division of Pathology, The Hospital for Sick Children, 170 Elizabeth St., Toronto, M5G 1H3, ON, Canada
- Birgit B Ertl-Wagner
- Neurosciences & Mental Health Research Program, The Hospital for Sick Children, 686 Bay St., Toronto, M5G 0A4, ON, Canada; Department of Diagnostic and Interventional Radiology, The Hospital for Sick Children, 170 Elizabeth St., Toronto, M5G 1H3, ON, Canada; Institute of Medical Science, University of Toronto, 1 King's College Circle, Toronto, M5S 1A8, ON, Canada; Department of Medical Imaging, University of Toronto, 263 McCaul St., Toronto, M5T 1W7, ON, Canada
- Farzad Khalvati
- Department of Computer Science, University of Toronto, 40 St George St., Toronto, M5S 2E4, ON, Canada; Neurosciences & Mental Health Research Program, The Hospital for Sick Children, 686 Bay St., Toronto, M5G 0A4, ON, Canada; Department of Diagnostic and Interventional Radiology, The Hospital for Sick Children, 170 Elizabeth St., Toronto, M5G 1H3, ON, Canada; Institute of Medical Science, University of Toronto, 1 King's College Circle, Toronto, M5S 1A8, ON, Canada; Department of Medical Imaging, University of Toronto, 263 McCaul St., Toronto, M5T 1W7, ON, Canada; Department of Mechanical and Industrial Engineering, University of Toronto, 5 King's College Road, Toronto, M5S 3G8, ON, Canada
15. Han H, Tian Z, Guo Q, Jiang J, Du S, Wang J. HSC-T: B-Ultrasound-to-Elastography Translation via Hierarchical Structural Consistency Learning for Thyroid Cancer Diagnosis. IEEE J Biomed Health Inform 2025;29:799-806. [PMID: 39495688; DOI: 10.1109/jbhi.2024.3491905]
Abstract
Elastography ultrasound imaging is increasingly important in the diagnosis of thyroid cancer and other diseases, but its reliance on specialized equipment and techniques limits widespread adoption. This paper proposes a novel multimodal ultrasound diagnostic pipeline that expands the application of elastography ultrasound by translating B-ultrasound (BUS) images into elastography ultrasound images (EUS). Additionally, to address the limitations of existing image-to-image translation methods, which struggle to effectively model inter-sample variations and accurately capture regional-scale structural consistency, we propose a BUS-to-EUS translation method based on hierarchical structural consistency. By incorporating domain-level, sample-level, patch-level, and pixel-level constraints, our approach guides the model in learning a more precise mapping from BUS to EUS, thereby enhancing diagnostic accuracy. Experimental results demonstrate that the proposed method significantly improves the accuracy of BUS-to-EUS translation on the MTUSI dataset and that the generated elastography images enhance nodule diagnostic accuracy on the STUSI and BUSI datasets compared to using BUS images alone. This advancement highlights the potential for broader application of elastography in clinical practice.
16. Dong G, He Y, Liu X, Dai J, Xie Y, Liang X. Better Cone-Beam CT Artifact Correction via Spatial and Channel Reconstruction Convolution Based on Unsupervised Adversarial Diffusion Models. Bioengineering (Basel) 2025;12:132. [PMID: 40001652; PMCID: PMC11851389; DOI: 10.3390/bioengineering12020132]
Abstract
Cone-beam computed tomography (CBCT) holds significant clinical value in image-guided radiotherapy (IGRT). However, CBCT images of low-density soft tissues are often plagued with artifacts and noise, which can lead to missed diagnoses and misdiagnoses. We propose a new unsupervised CBCT image artifact correction algorithm, named Spatial Convolution Diffusion (ScDiff), based on a conditional diffusion model, which combines the unsupervised learning ability of generative adversarial networks (GANs) with the stable training characteristics of diffusion models. This approach can efficiently and stably correct CBCT image artifacts, resulting in clear, realistic CBCT images with complete anatomical structures. The proposed model effectively improves CBCT image quality, reducing artifacts while preserving the anatomical structure. We compared the proposed method with several GAN- and diffusion-based methods; ours achieved the highest corrected image quality and the best evaluation metrics.
Affiliation(s)
- Guoya Dong
- Hebei Key Laboratory of Bioelectromagnetics and Neural Engineering, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin 300130, China
- Yutong He
- Hebei Key Laboratory of Bioelectromagnetics and Neural Engineering, School of Health Sciences and Biomedical Engineering, Hebei University of Technology, Tianjin 300130, China
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Xuan Liu
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Jingjing Dai
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Yaoqin Xie
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Xiaokun Liang
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
17. Ahmed S, Feng J, Ferzund J, Yaqub M, Ali MU, Manan MA, Raheem A. FAME: A Federated Adversarial Learning Framework for Privacy-Preserving MRI Reconstruction. Appl Magn Reson 2025. [DOI: 10.1007/s00723-025-01749-0]
18. Chen Z, Bian Y, Shen E, Fan L, Zhu W, Shi F, Shao C, Chen X, Xiang D. Moment-Consistent Contrastive CycleGAN for Cross-Domain Pancreatic Image Segmentation. IEEE Trans Med Imaging 2025;44:422-435. [PMID: 39167524; DOI: 10.1109/tmi.2024.3447071]
Abstract
CT and MR are currently the most common imaging techniques for pancreatic cancer diagnosis. Accurate segmentation of the pancreas in CT and MR images can provide significant help in the diagnosis and treatment of pancreatic cancer. Traditional supervised segmentation methods require a large amount of labeled CT and MR training data, which is usually time-consuming and laborious to obtain. Meanwhile, due to domain shift, traditional segmentation networks are difficult to deploy on datasets from different imaging modalities. Cross-domain segmentation can utilize labeled source-domain data to assist unlabeled target domains in solving these problems. In this paper, a cross-domain pancreas segmentation algorithm is proposed based on Moment-Consistent Contrastive Cycle Generative Adversarial Networks (MC-CCycleGAN). MC-CCycleGAN is a style transfer network in which the encoder of its generator extracts features from real and style-transferred images, constrained by a contrastive loss, so that structural features of input images are fully extracted during style transfer while redundant style features are eliminated. The multi-order central moments of the pancreas are proposed to describe its anatomy in high dimensions, and a contrastive loss is also proposed to constrain moment consistency, so as to maintain the consistency of pancreatic structure and shape before and after style transfer. A multi-teacher knowledge distillation framework is proposed to transfer knowledge from multiple teachers to a single student, improving the robustness and performance of the student network. The experimental results demonstrate the superiority of our framework over state-of-the-art domain adaptation methods.
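Multi-order central moments of a segmented region, and a consistency penalty between pre- and post-transfer moments, can be computed as follows; the order range and normalization are illustrative choices, not necessarily the paper's.

```python
# Scale-normalized multi-order central moments of a 2D mask, plus an L1
# consistency penalty between pre- and post-style-transfer descriptors.
import numpy as np

def central_moments(mask, max_order=3):
    ys, xs = np.nonzero(mask)
    yc, xc = ys.mean(), xs.mean()
    feats = []
    for p in range(max_order + 1):
        for q in range(max_order + 1 - p):
            if p + q >= 2:  # orders 0 and 1 are trivial by construction
                m = ((ys - yc) ** p * (xs - xc) ** q).mean()
                feats.append(m / mask.sum() ** ((p + q) / 2))  # scale-normalized
    return np.array(feats)

a = np.zeros((64, 64), bool); a[20:44, 18:42] = True  # pancreas mask before transfer
b = np.zeros((64, 64), bool); b[21:45, 19:43] = True  # mask after style transfer
moment_consistency = np.abs(central_moments(a) - central_moments(b)).mean()
print(moment_consistency)
```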
19. Feng Y, Deng S, Lyu J, Cai J, Wei M, Qin J. Bridging MRI Cross-Modality Synthesis and Multi-Contrast Super-Resolution by Fine-Grained Difference Learning. IEEE Trans Med Imaging 2025;44:373-383. [PMID: 39159018; DOI: 10.1109/tmi.2024.3445969]
Abstract
In multi-modal magnetic resonance imaging (MRI), the tasks of imputing or reconstructing the target modality share a common obstacle: the accurate modeling of fine-grained inter-modal differences, which has been sparingly addressed in current literature. These differences stem from two sources: 1) spatial misalignment remaining after coarse registration and 2) structural distinction arising from modality-specific signal manifestations. This paper integrates the previously separate research trajectories of cross-modality synthesis (CMS) and multi-contrast super-resolution (MCSR) to address this pervasive challenge within a unified framework. Connected through generalized down-sampling ratios, this unification not only emphasizes their common goal in reducing structural differences, but also identifies the key task distinguishing MCSR from CMS: modeling the structural distinctions using the limited information from the misaligned target input. Specifically, we propose a composite network architecture with several key components: a label correction module to align the coordinates of multi-modal training pairs, a CMS module serving as the base model, an SR branch to handle target inputs, and a difference projection discriminator for structural distinction-centered adversarial training. When training the SR branch as the generator, the adversarial learning is enhanced with distinction-aware incremental modulation to ensure better-controlled generation. Moreover, the SR branch integrates deformable convolutions to address cross-modal spatial misalignment at the feature level. Experiments conducted on three public datasets demonstrate that our approach effectively balances structural accuracy and realism, exhibiting overall superiority in comprehensive evaluations for both tasks over current state-of-the-art approaches. The code is available at https://github.com/papshare/FGDL.
20. Chang C, Yao L, Zhao X. A weakly supervised model for incomplete multimodal MRI synthesis with tumor-aware approach. Med Phys 2025;52:362-374. [PMID: 39432673; DOI: 10.1002/mp.17443]
Abstract
BACKGROUND Magnetic resonance images (MRIs) are a valuable tool in the study of brain tumors, and multimodal sequences provide unique insights into different aspects of brain tumors. However, in clinical practice, missing modalities are often encountered due to various factors, making it difficult to obtain comprehensive and reliable tumor-related information. PURPOSE The purpose of this work is to develop an algorithm for synthesizing missing MRI modalities with high precision, centered on generating accurate tumor-related information to offer more data for clinical diagnosis. METHODS A novel weakly supervised MRI synthesis model named TAM-DAM-GAN has been proposed, which integrates tumor-aware and detail adjustment mechanisms to enhance the quality of tumor generation. The tumor-aware mechanism leverages weak label information to guide the network to classify images based on crucial information in local structures, thereby compelling the generative network to identify and emphasize learning of local tumor regions. The detail adjustment mechanism utilizes a discriminator to create pixel-level attention maps in real time, which are then used to modify the loss weights and thereby adjust the generated details. RESULTS Generation quality was evaluated on four tasks (T1-to-T2, T2-to-T1, T1-to-FLAIR, and FLAIR-to-T1). Experiments on the BRATS2015 dataset show that the proposed approach is superior in both qualitative and quantitative measures. Taking FLAIR-to-T1 as an example, TAM-DAM-GAN improves the PSNR of the tumor region from 18.556 to 20.576 compared to the baseline. Also, using real FLAIR data with generated T1 data boosts tumor segmentation accuracy by 10% compared to using only real FLAIR data. CONCLUSION This finding will be conducive to enhancing the accuracy of cross-modality synthesis in incomplete multimodal MRI, especially for tumor regions, thereby providing more dependable and comprehensive data for clinical diagnosis and scientific research.
Collapse
Affiliation(s)
- Can Chang
- School of Artificial Intelligence, Beijing Normal University, Beijing, China
| | - Li Yao
- School of Artificial Intelligence, Beijing Normal University, Beijing, China
| | - Xiaojie Zhao
- School of Artificial Intelligence, Beijing Normal University, Beijing, China
| |
Collapse
|
21
|
Zhang Y, Peng C, Wang Q, Song D, Li K, Kevin Zhou S. Unified Multi-Modal Image Synthesis for Missing Modality Imputation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2025; 44:4-18. [PMID: 38976465 DOI: 10.1109/tmi.2024.3424785] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/10/2024]
Abstract
Multi-modal medical images provide complementary soft-tissue characteristics that aid in the screening and diagnosis of diseases. However, limited scanning time, image corruption, and varying imaging protocols often result in incomplete multi-modal images, thus limiting the use of multi-modal data for clinical purposes. To address this issue, we propose a novel unified multi-modal image synthesis method for missing modality imputation. Our method adopts a generative adversarial architecture, which aims to synthesize missing modalities from any combination of available ones with a single model. To this end, we specifically design a Commonality- and Discrepancy-Sensitive Encoder for the generator to exploit both modality-invariant and modality-specific information contained in the input modalities. The incorporation of both types of information facilitates the generation of images with consistent anatomy and realistic details of the desired distribution. In addition, we propose a Dynamic Feature Unification Module to integrate information from a varying number of available modalities, which enables the network to be robust to randomly missing modalities. The module performs both hard integration and soft integration, ensuring the effectiveness of feature combination while avoiding information loss. Verified on two public multi-modal magnetic resonance datasets, the proposed method is effective in handling various synthesis tasks and shows superior performance compared to previous methods.
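The hard/soft integration of the Dynamic Feature Unification Module could look roughly like the following. The element-wise maximum and attention-weighted sum are one plausible reading of "hard" and "soft" integration, not the paper's verified code.

```python
# Sketch: fusing encoder features from a variable number of modalities.
import torch

def unify_features(features):
    # features: encoder outputs for the currently available modalities,
    # each of shape (B, C, H, W); works for any number M >= 1 of modalities.
    stack = torch.stack(features)                      # (M, B, C, H, W)
    hard = stack.max(dim=0).values                     # hard integration
    attn = torch.softmax(stack.mean(dim=(2, 3, 4), keepdim=True), dim=0)
    soft = (attn * stack).sum(dim=0)                   # soft integration
    return hard + soft

fused = unify_features([torch.randn(2, 16, 32, 32) for _ in range(3)])
print(fused.shape)  # torch.Size([2, 16, 32, 32])
```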
Collapse
|
22
|
Xu F, Mandija S, Kleinloog JPD, Liu H, van der Heide O, van der Kolk AG, Dankbaar JW, van den Berg CAT, Sbrizzi A. Improving the lesion appearance on FLAIR images synthetized from quantitative MRI: a fast, hybrid approach. MAGMA (NEW YORK, N.Y.) 2024; 37:1021-1030. [PMID: 39180686 PMCID: PMC11582199 DOI: 10.1007/s10334-024-01198-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/01/2024] [Revised: 07/19/2024] [Accepted: 07/30/2024] [Indexed: 08/26/2024]
Abstract
OBJECTIVE The image quality of synthesized FLAIR (fluid attenuated inversion recovery) images is generally inferior to that of their conventional counterparts, especially regarding lesion contrast mismatch. This work aimed to improve the lesion appearance through a hybrid methodology. MATERIALS AND METHODS We combined a full-brain 5-min MR-STAT acquisition, followed by a FLAIR synthesis step, with an ultra-undersampled conventional FLAIR sequence, and performed retrospective and prospective analyses of the proposed method on patient datasets and a healthy volunteer. RESULTS All performance metrics of the proposed hybrid FLAIR images on patient datasets were significantly higher than those of the physics-based FLAIR images (p < 0.005), and comparable to those of conventional FLAIR images. The small difference between the prospective and retrospective analyses on a healthy volunteer demonstrated the validity of the retrospective analysis of the hybrid method as presented for the patient datasets. DISCUSSION The proposed hybrid FLAIR achieved an improved lesion appearance in clinical cases with neurological diseases compared to the physics-based FLAIR images. Future prospective work on patient data will address the validation of the method from a diagnostic perspective by radiological inspection of the new images over a larger patient cohort.
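One way such a hybrid could be realized is a k-space data-consistency merge between the undersampled conventional acquisition and the synthetic FLAIR. The sketch below is an assumed mechanism for illustration only; the abstract does not specify the paper's exact combination strategy.

```python
# Sketch: keep measured k-space lines, fill the rest from the synthetic image.
import numpy as np

def hybrid_flair(synthetic_img, acquired_kspace, mask):
    # mask: True where k-space was actually acquired by the fast FLAIR scan.
    synth_k = np.fft.fftshift(np.fft.fft2(synthetic_img))
    merged = np.where(mask, acquired_kspace, synth_k)
    return np.abs(np.fft.ifft2(np.fft.ifftshift(merged)))

synthetic = np.random.rand(256, 256)                  # stand-in synthetic FLAIR
mask = np.zeros((256, 256), dtype=bool)
mask[:, ::8] = True                                   # every 8th phase-encode line
acquired = np.fft.fftshift(np.fft.fft2(np.random.rand(256, 256))) * mask
print(hybrid_flair(synthetic, acquired, mask).shape)  # (256, 256)
```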
Collapse
Affiliation(s)
- Fei Xu
- Computational Imaging Group for MR Diagnostics & Therapy, Center for Image Sciences, University Medical Center Utrecht, Utrecht, The Netherlands.
- Department of Radiotherapy, University Medical Center Utrecht, Utrecht, The Netherlands.
| | - Stefano Mandija
- Computational Imaging Group for MR Diagnostics & Therapy, Center for Image Sciences, University Medical Center Utrecht, Utrecht, The Netherlands
- Department of Radiotherapy, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Jordi P D Kleinloog
- Computational Imaging Group for MR Diagnostics & Therapy, Center for Image Sciences, University Medical Center Utrecht, Utrecht, The Netherlands
- Department of Radiotherapy, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Hongyan Liu
- Computational Imaging Group for MR Diagnostics & Therapy, Center for Image Sciences, University Medical Center Utrecht, Utrecht, The Netherlands
- Department of Radiotherapy, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Oscar van der Heide
- Computational Imaging Group for MR Diagnostics & Therapy, Center for Image Sciences, University Medical Center Utrecht, Utrecht, The Netherlands
- Department of Radiotherapy, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Anja G van der Kolk
- Department of Medical Imaging, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Jan Willem Dankbaar
- Department of Radiology, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Cornelis A T van den Berg
- Computational Imaging Group for MR Diagnostics & Therapy, Center for Image Sciences, University Medical Center Utrecht, Utrecht, The Netherlands
- Department of Radiotherapy, University Medical Center Utrecht, Utrecht, The Netherlands
| | - Alessandro Sbrizzi
- Computational Imaging Group for MR Diagnostics & Therapy, Center for Image Sciences, University Medical Center Utrecht, Utrecht, The Netherlands
- Department of Radiotherapy, University Medical Center Utrecht, Utrecht, The Netherlands
| |
Collapse
|
23
|
Peng W, Bosschieter T, Ouyang J, Paul R, Sullivan EV, Pfefferbaum A, Adeli E, Zhao Q, Pohl KM. Metadata-conditioned generative models to synthesize anatomically-plausible 3D brain MRIs. Med Image Anal 2024; 98:103325. [PMID: 39208560 DOI: 10.1016/j.media.2024.103325] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/11/2023] [Revised: 08/06/2024] [Accepted: 08/20/2024] [Indexed: 09/04/2024]
Abstract
Recent advances in generative models have paved the way for enhanced generation of natural and medical images, including synthetic brain MRIs. However, current AI research mainly focuses on optimizing synthetic MRIs with respect to visual quality (such as signal-to-noise ratio) while lacking insights into their relevance to neuroscience. To generate high-quality T1-weighted MRIs relevant for neuroscience discovery, we present a two-stage Diffusion Probabilistic Model (called BrainSynth) to synthesize high-resolution MRIs conditioned on metadata (such as age and sex). We then propose a novel procedure to assess the quality of BrainSynth according to how well its synthetic MRIs capture the macrostructural properties of brain regions and how accurately they encode the effects of age and sex. Results indicate that more than half of the brain regions in our synthetic MRIs are anatomically plausible, i.e., the effect size between real and synthetic MRIs is small relative to biological factors such as age and sex. Moreover, the anatomical plausibility varies across cortical regions according to their geometric complexity. As is, the MRIs generated by BrainSynth significantly improve the training of a predictive model to identify accelerated aging effects in an independent study. These results indicate that our model accurately captures the brain's anatomical information and could thus enrich the data of underrepresented samples in a study. The code of BrainSynth will be released as part of the MONAI project at https://github.com/Project-MONAI/GenerativeModels.
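The anatomical-plausibility criterion compares the real-vs-synthetic effect size of regional measures against the effect sizes of biological factors. A minimal sketch using Cohen's d on hypothetical regional volumes (the numbers are random stand-ins, not study data):

```python
# Sketch: pooled-SD effect size between real and synthetic regional measures.
import numpy as np

def cohens_d(a, b):
    # "Anatomically plausible" regions are those where this real-vs-synthetic
    # effect is small relative to effects of factors such as age and sex.
    na, nb = len(a), len(b)
    pooled = np.sqrt(((na - 1) * np.var(a, ddof=1) + (nb - 1) * np.var(b, ddof=1))
                     / (na + nb - 2))
    return (np.mean(a) - np.mean(b)) / pooled

real_vol = np.random.normal(7.5, 0.6, 200)   # hypothetical hippocampal volumes (mL)
synth_vol = np.random.normal(7.4, 0.7, 200)
print(f"effect size d = {cohens_d(real_vol, synth_vol):.3f}")
```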
Collapse
Affiliation(s)
- Wei Peng
- Department of Psychiatry & Behavioral Sciences, Stanford University, Stanford, CA 94305, United States of America
| | - Tomas Bosschieter
- Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA 94305, United States of America
| | - Jiahong Ouyang
- Department of Electrical Engineering, Stanford University, Stanford, CA 94305, United States of America
| | - Robert Paul
- Missouri Institute of Mental Health, University of Missouri, St. Louis, MO 63121, United States of America
| | - Edith V Sullivan
- Department of Psychiatry & Behavioral Sciences, Stanford University, Stanford, CA 94305, United States of America
| | - Adolf Pfefferbaum
- Center for Health Sciences, SRI International, Menlo Park, CA 94025, United States of America
| | - Ehsan Adeli
- Department of Psychiatry & Behavioral Sciences, Stanford University, Stanford, CA 94305, United States of America; Department of Computer Science, Stanford University, Stanford, CA 94305, United States of America
| | - Qingyu Zhao
- Department of Radiology, Weill Cornell Medicine, New York, NY 10065, United States of America.
| | - Kilian M Pohl
- Department of Psychiatry & Behavioral Sciences, Stanford University, Stanford, CA 94305, United States of America; Department of Electrical Engineering, Stanford University, Stanford, CA 94305, United States of America.
| |
Collapse
|
24
|
Akpinar MH, Sengur A, Salvi M, Seoni S, Faust O, Mir H, Molinari F, Acharya UR. Synthetic Data Generation via Generative Adversarial Networks in Healthcare: A Systematic Review of Image- and Signal-Based Studies. IEEE OPEN JOURNAL OF ENGINEERING IN MEDICINE AND BIOLOGY 2024; 6:183-192. [PMID: 39698120 PMCID: PMC11655107 DOI: 10.1109/ojemb.2024.3508472] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/01/2024] [Revised: 11/13/2024] [Accepted: 11/26/2024] [Indexed: 12/20/2024] Open
Abstract
Generative Adversarial Networks (GANs) have emerged as a powerful tool in artificial intelligence, particularly for unsupervised learning. This systematic review analyzes GAN applications in healthcare, focusing on image and signal-based studies across various clinical domains. Following Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) guidelines, we reviewed 72 relevant journal articles. Our findings reveal that magnetic resonance imaging (MRI) and electrocardiogram (ECG) signal acquisition techniques were most utilized, with brain studies (22%), cardiology (18%), cancer (15%), ophthalmology (12%), and lung studies (10%) being the most researched areas. We discuss key GAN architectures, including cGAN (31%) and CycleGAN (18%), along with datasets, evaluation metrics, and performance outcomes. The review highlights promising data augmentation, anonymization, and multi-task learning results. We identify current limitations, such as the lack of standardized metrics and direct comparisons, and propose future directions, including the development of no-reference metrics, immersive simulation scenarios, and enhanced interpretability.
Collapse
Affiliation(s)
- Muhammed Halil Akpinar
- Vocational School of Technical Sciences, Istanbul University-Cerrahpasa, 34320 Istanbul, Türkiye
| | | | - Massimo Salvi
- Department of Electronics and Telecommunications, Politecnico di Torino, 10129 Turin, Italy
| | - Silvia Seoni
- Department of Electronics and Telecommunications, Politecnico di Torino, 10129 Turin, Italy
| | - Oliver Faust
- Anglia Ruskin University, Cambridge Campus, Cambridge CB1 1PT, U.K.
| | - Hasan Mir
- American University of Sharjah, Sharjah 26666, UAE
| | - Filippo Molinari
- Department of Electronics and Telecommunications, Politecnico di Torino, 10129 Turin, Italy
| | | |
Collapse
|
25
|
Ahmadzade M, Moron FE, Shastri R, Lincoln C, Rad MG. AI-Assisted Post Contrast Brain MRI: Eighty Percent Reduction in Contrast Dose. Acad Radiol 2024:S1076-6332(24)00787-6. [PMID: 39592383 DOI: 10.1016/j.acra.2024.10.026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2024] [Revised: 10/10/2024] [Accepted: 10/18/2024] [Indexed: 11/28/2024]
Abstract
OBJECTIVES In the context of growing safety concerns regarding the use of gadolinium-based contrast agents in contrast-enhanced MRI, there is a need for dose reduction without compromising diagnostic accuracy. A deep learning (DL) method is proposed and evaluated in this study for predicting full-dose contrast-enhanced T1w images from multiparametric MRI acquired with 20% of the standard dose of gadolinium-based contrast agents. MATERIALS AND METHODS This multicentric prospective study leveraged multiparametric brain MRIs acquired between March and July 2024. A total of 101 patients were included. Patients with white matter disease, small vessel disease, tumor or mass, post-operative changes, or no enhancing lesions were included. Pre-contrast, low-dose, and standard-dose postcontrast T1w sequences were acquired. A DL network was utilized to process pre-contrast and low-dose sequences to generate synthesized full-dose contrast-enhanced T1w images. DL-T1w images and full-dose T1w MRI images were qualitatively and quantitatively compared using both automated voxel-wise metrics and a reader study, in which three neuroradiologists graded the image quality, image SNR, vessel conspicuity, and lesion visualization using a 5-point Likert scale. RESULTS A comparison of the average reader scores for DL-T1w images and full-dose T1w images did not show any significant differences in image quality (P = 0.08); however, the image SNR and vessel conspicuity scores were higher for DL-T1w images (P < 0.05). In all three reader evaluations, the lower limit of the 95% CI for differences in least square means for border delineation, internal morphology, and contrast enhancement was above the noninferiority margin, showing statistical noninferiority between DL-T1w and full-dose T1w paired images (≥ -0.26) (P < 0.001). The DL-T1w images obtained an SSIM of 86 ± 12.1% relative to the full-dose T1w images, and a PSNR of 27 ± 3 dB. CONCLUSION The proposed DL method was capable of generating synthesized postcontrast T1-weighted MR images that were comparable to full-dose T1w images, as determined by quantitative analysis and radiologist evaluation.
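The reported SSIM and PSNR can be computed on any pair of registered images with standard tooling; the arrays below are random stand-ins, not study data.

```python
# Sketch: voxel-wise quality metrics between a DL-synthesized slice and the
# full-dose reference, using scikit-image.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

full_dose = np.random.rand(256, 256)                     # registered reference
dl_t1w = full_dose + 0.05 * np.random.randn(256, 256)    # synthetic stand-in

psnr = peak_signal_noise_ratio(full_dose, dl_t1w, data_range=1.0)
ssim = structural_similarity(full_dose, dl_t1w, data_range=1.0)
print(f"PSNR = {psnr:.1f} dB, SSIM = {ssim:.3f}")
```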
Collapse
Affiliation(s)
- Mohadese Ahmadzade
- Department of Radiology, Section of Vascular and Interventional Radiology, Baylor College of Medicine, Houston, TX (M.A., M.G.R.)
| | - Fanny Emilia Moron
- Department of Radiology, Section of Neuroradiology, Baylor College of Medicine, Houston, TX (F.E.M.)
| | - Ravi Shastri
- Department of Radiology, Section of Interventional Neuroradiology, Baylor College of Medicine, Houston, TX (R.S.)
| | - Christie Lincoln
- Department of Radiology, Section of Neuroradiology, MD Anderson Cancer center, UT McGovern, Houston, TX (C.L.)
| | - Mohammad Ghasemi Rad
- Department of Radiology, Section of Vascular and Interventional Radiology, Baylor College of Medicine, Houston, TX (M.A., M.G.R.).
| |
Collapse
|
26
|
Zhang R, Du X, Li H. Application and performance enhancement of FAIMS spectral data for deep learning analysis using generative adversarial network reinforcement. Anal Biochem 2024; 694:115627. [PMID: 39033946 DOI: 10.1016/j.ab.2024.115627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/07/2024] [Revised: 06/21/2024] [Accepted: 07/18/2024] [Indexed: 07/23/2024]
Abstract
When using high-field asymmetric waveform ion mobility spectrometry (FAIMS) to process complex mixtures for deep learning analysis, recognition performance suffers from a lack of high-quality data and low sample diversity. In this paper, a Generative Adversarial Network (GAN) method is introduced to simulate and generate highly realistic and diverse spectra to expand the dataset, using real mixture spectral data of 15 classes collected by FAIMS. The mixed datasets were fed into VGG and ResNeXt for testing, and the experimental results showed that the best recognition performance was achieved when the ratio of real data to generated data was 1:4: accuracy improved by 24.19% and 6.43%; precision improved by 23.71% and 6.97%; recall improved by 21.08% and 7.09%; and F1-score improved by 24.50% and 8.23%. These results strongly demonstrate that GANs can effectively expand the data volume and increase sample diversity without additional experimental cost, significantly enhancing the utility of FAIMS spectra for the analysis of complex mixtures.
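A sketch of the 1:4 real-to-generated dataset mixing that the study found optimal; the array shapes and label ranges are illustrative, not the study's data format.

```python
# Sketch: augment real spectra with GAN-generated ones at a fixed ratio.
import numpy as np

def build_training_set(real_x, real_y, gen_x, gen_y, ratio=4):
    # The study above reports the best recognition performance at a
    # 1:4 real-to-generated mix.
    n_gen = min(len(gen_x), ratio * len(real_x))
    x = np.concatenate([real_x, gen_x[:n_gen]])
    y = np.concatenate([real_y, gen_y[:n_gen]])
    perm = np.random.permutation(len(x))
    return x[perm], y[perm]

x, y = build_training_set(np.random.rand(100, 512), np.random.randint(0, 15, 100),
                          np.random.rand(400, 512), np.random.randint(0, 15, 400))
print(x.shape, y.shape)  # (500, 512) (500,)
```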
Collapse
Affiliation(s)
- Ruilong Zhang
- School of Life and Environmental Sciences, GuiLin University of Electronic Technology, GuiLin, 541004, China
| | - Xiaoxia Du
- School of Life and Environmental Sciences, GuiLin University of Electronic Technology, GuiLin, 541004, China.
| | - Hua Li
- School of Life and Environmental Sciences, GuiLin University of Electronic Technology, GuiLin, 541004, China.
| |
Collapse
|
27
|
Xu C, Li J, Wang Y, Wang L, Wang Y, Zhang X, Liu W, Chen J, Vatian A, Gusarova N, Ye C, Zheng Z. SiMix: A domain generalization method for cross-site brain MRI harmonization via site mixing. Neuroimage 2024; 299:120812. [PMID: 39197559 DOI: 10.1016/j.neuroimage.2024.120812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/04/2024] [Revised: 08/20/2024] [Accepted: 08/22/2024] [Indexed: 09/01/2024] Open
Abstract
Brain magnetic resonance imaging (MRI) is widely used in clinical practice for disease diagnosis. However, MRI scans acquired at different sites can have different appearances due to differences in hardware, pulse sequences, and imaging parameters. It is important to reduce or eliminate such cross-site variations with brain MRI harmonization so that downstream image processing and analysis are performed consistently. Previous works on the harmonization problem require data acquired from the sites of interest for model training. In real-world scenarios, however, test data may arrive from a new site of interest after the model is trained, with no training data from that site available at training time. In this case, previous methods cannot optimally handle the test data from the new unseen site. To address this problem, in this work we explore domain generalization for brain MRI harmonization and propose Site Mix (SiMix). We assume that images of travelling subjects are acquired at a few existing sites for model training. To allow the training data to better represent test data from unseen sites, we first propose to stochastically mix training images belonging to different sites, which substantially increases the diversity of the training data while preserving the authenticity of the mixed training images. Second, at test time, when a test image from an unseen site is given, we propose a multiview strategy that perturbs the test image with preserved authenticity and ensembles the harmonization results of the perturbed images for improved harmonization quality. To validate SiMix, we performed experiments on the publicly available SRPBS and MUSHAC datasets, which comprise brain MRI acquired at nine and two different sites, respectively. The results indicate that SiMix improves brain MRI harmonization for unseen sites and is also beneficial to the harmonization of existing sites.
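The site-mixing step admits a mixup-style reading: blend two acquisitions of the same travelling subject with a random weight. The Beta-distributed weight below is an assumption; SiMix's exact mixing rule may differ.

```python
# Sketch: stochastic cross-site image mixing for domain generalization.
import torch

def simix(x_site_a, x_site_b, alpha=0.4):
    # x_site_a, x_site_b: the same travelling subject scanned at two sites.
    # A Beta-sampled weight keeps the blend near one endpoint, diversifying
    # site appearance while preserving anatomical authenticity.
    lam = torch.distributions.Beta(alpha, alpha).sample()
    return lam * x_site_a + (1 - lam) * x_site_b

mixed = simix(torch.rand(1, 1, 96, 96), torch.rand(1, 1, 96, 96))
print(mixed.shape)  # torch.Size([1, 1, 96, 96])
```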
Collapse
Affiliation(s)
- Chundan Xu
- School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing, China
| | - Jie Li
- Department of Radiology, Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, China
| | - Yakui Wang
- Department of Radiology, Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, China
| | - Lixue Wang
- Department of Radiology, Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, China
| | - Yizhe Wang
- Department of Radiology, Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, China
| | - Xiaofeng Zhang
- School of Information and Electronics, Beijing Institute of Technology, Zhuhai, China
| | - Weiqi Liu
- Sophmind Technology (Beijing) Co., Ltd., Beijing, China
| | - Jingang Chen
- Sophmind Technology (Beijing) Co., Ltd., Beijing, China
| | - Aleksandra Vatian
- Faculty of Infocommunicational Technologies, ITMO University, St. Petersburg, Russia
| | - Natalia Gusarova
- Faculty of Infocommunicational Technologies, ITMO University, St. Petersburg, Russia
| | - Chuyang Ye
- School of Integrated Circuits and Electronics, Beijing Institute of Technology, Beijing, China.
| | - Zhuozhao Zheng
- Department of Radiology, Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, China.
| |
Collapse
|
28
|
Koetzier LR, Wu J, Mastrodicasa D, Lutz A, Chung M, Koszek WA, Pratap J, Chaudhari AS, Rajpurkar P, Lungren MP, Willemink MJ. Generating Synthetic Data for Medical Imaging. Radiology 2024; 312:e232471. [PMID: 39254456 PMCID: PMC11444329 DOI: 10.1148/radiol.232471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2023] [Revised: 02/15/2024] [Accepted: 03/01/2024] [Indexed: 09/11/2024]
Abstract
Artificial intelligence (AI) models for medical imaging tasks, such as classification or segmentation, require large and diverse datasets of images. However, due to privacy and ethical issues, as well as data sharing infrastructure barriers, these datasets are scarce and difficult to assemble. Synthetic medical imaging data generated by AI from existing data could address this challenge by augmenting and anonymizing real imaging data. In addition, synthetic data enable new applications, including modality translation, contrast synthesis, and professional training for radiologists. However, the use of synthetic data also poses technical and ethical challenges. These challenges include ensuring the realism and diversity of the synthesized images while keeping data unidentifiable, evaluating the performance and generalizability of models trained on synthetic data, and high computational costs. Since existing regulations are not sufficient to guarantee the safe and ethical use of synthetic images, it becomes evident that updated laws and more rigorous oversight are needed. Regulatory bodies, physicians, and AI developers should collaborate to develop, maintain, and continually refine best practices for synthetic data. This review aims to provide an overview of the current knowledge of synthetic data in medical imaging and highlights current key challenges in the field to guide future research and development.
Collapse
Affiliation(s)
- Lennart R. Koetzier
- From the Delft University of Technology, Delft, the Netherlands (L.R.K.); Segmed, 3790 El Camino Real #810, Palo Alto, CA 94306 (J.W., A.L., M.C., W.A.K., J.P., M.J.W.); Department of Radiology, University of Washington, Seattle, Wash (D.M.); Department of Radiology, OncoRad/Tumor Imaging Metrics Core, Seattle, Wash (D.M.); Harvard University, Cambridge, Mass (J.P.); Department of Radiology, Stanford University School of Medicine, Palo Alto, Calif (A.S.C.); Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, Calif (A.S.C.); Department of Biomedical Informatics, Harvard Medical School, Boston, Mass (P.R.); Microsoft, Redmond, Wash (M.P.L.); and Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, Calif (M.P.L.)
| | - Jie Wu
- From the Delft University of Technology, Delft, the Netherlands (L.R.K.); Segmed, 3790 El Camino Real #810, Palo Alto, CA 94306 (J.W., A.L., M.C., W.A.K., J.P., M.J.W.); Department of Radiology, University of Washington, Seattle, Wash (D.M.); Department of Radiology, OncoRad/Tumor Imaging Metrics Core, Seattle, Wash (D.M.); Harvard University, Cambridge, Mass (J.P.); Department of Radiology, Stanford University School of Medicine, Palo Alto, Calif (A.S.C.); Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, Calif (A.S.C.); Department of Biomedical Informatics, Harvard Medical School, Boston, Mass (P.R.); Microsoft, Redmond, Wash (M.P.L.); and Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, Calif (M.P.L.)
| | - Domenico Mastrodicasa
- From the Delft University of Technology, Delft, the Netherlands (L.R.K.); Segmed, 3790 El Camino Real #810, Palo Alto, CA 94306 (J.W., A.L., M.C., W.A.K., J.P., M.J.W.); Department of Radiology, University of Washington, Seattle, Wash (D.M.); Department of Radiology, OncoRad/Tumor Imaging Metrics Core, Seattle, Wash (D.M.); Harvard University, Cambridge, Mass (J.P.); Department of Radiology, Stanford University School of Medicine, Palo Alto, Calif (A.S.C.); Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, Calif (A.S.C.); Department of Biomedical Informatics, Harvard Medical School, Boston, Mass (P.R.); Microsoft, Redmond, Wash (M.P.L.); and Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, Calif (M.P.L.)
| | - Aline Lutz
- From the Delft University of Technology, Delft, the Netherlands (L.R.K.); Segmed, 3790 El Camino Real #810, Palo Alto, CA 94306 (J.W., A.L., M.C., W.A.K., J.P., M.J.W.); Department of Radiology, University of Washington, Seattle, Wash (D.M.); Department of Radiology, OncoRad/Tumor Imaging Metrics Core, Seattle, Wash (D.M.); Harvard University, Cambridge, Mass (J.P.); Department of Radiology, Stanford University School of Medicine, Palo Alto, Calif (A.S.C.); Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, Calif (A.S.C.); Department of Biomedical Informatics, Harvard Medical School, Boston, Mass (P.R.); Microsoft, Redmond, Wash (M.P.L.); and Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, Calif (M.P.L.)
| | - Matthew Chung
- From the Delft University of Technology, Delft, the Netherlands (L.R.K.); Segmed, 3790 El Camino Real #810, Palo Alto, CA 94306 (J.W., A.L., M.C., W.A.K., J.P., M.J.W.); Department of Radiology, University of Washington, Seattle, Wash (D.M.); Department of Radiology, OncoRad/Tumor Imaging Metrics Core, Seattle, Wash (D.M.); Harvard University, Cambridge, Mass (J.P.); Department of Radiology, Stanford University School of Medicine, Palo Alto, Calif (A.S.C.); Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, Calif (A.S.C.); Department of Biomedical Informatics, Harvard Medical School, Boston, Mass (P.R.); Microsoft, Redmond, Wash (M.P.L.); and Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, Calif (M.P.L.)
| | - W. Adam Koszek
- From the Delft University of Technology, Delft, the Netherlands (L.R.K.); Segmed, 3790 El Camino Real #810, Palo Alto, CA 94306 (J.W., A.L., M.C., W.A.K., J.P., M.J.W.); Department of Radiology, University of Washington, Seattle, Wash (D.M.); Department of Radiology, OncoRad/Tumor Imaging Metrics Core, Seattle, Wash (D.M.); Harvard University, Cambridge, Mass (J.P.); Department of Radiology, Stanford University School of Medicine, Palo Alto, Calif (A.S.C.); Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, Calif (A.S.C.); Department of Biomedical Informatics, Harvard Medical School, Boston, Mass (P.R.); Microsoft, Redmond, Wash (M.P.L.); and Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, Calif (M.P.L.)
| | - Jayanth Pratap
- From the Delft University of Technology, Delft, the Netherlands (L.R.K.); Segmed, 3790 El Camino Real #810, Palo Alto, CA 94306 (J.W., A.L., M.C., W.A.K., J.P., M.J.W.); Department of Radiology, University of Washington, Seattle, Wash (D.M.); Department of Radiology, OncoRad/Tumor Imaging Metrics Core, Seattle, Wash (D.M.); Harvard University, Cambridge, Mass (J.P.); Department of Radiology, Stanford University School of Medicine, Palo Alto, Calif (A.S.C.); Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, Calif (A.S.C.); Department of Biomedical Informatics, Harvard Medical School, Boston, Mass (P.R.); Microsoft, Redmond, Wash (M.P.L.); and Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, Calif (M.P.L.)
| | - Akshay S. Chaudhari
- From the Delft University of Technology, Delft, the Netherlands (L.R.K.); Segmed, 3790 El Camino Real #810, Palo Alto, CA 94306 (J.W., A.L., M.C., W.A.K., J.P., M.J.W.); Department of Radiology, University of Washington, Seattle, Wash (D.M.); Department of Radiology, OncoRad/Tumor Imaging Metrics Core, Seattle, Wash (D.M.); Harvard University, Cambridge, Mass (J.P.); Department of Radiology, Stanford University School of Medicine, Palo Alto, Calif (A.S.C.); Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, Calif (A.S.C.); Department of Biomedical Informatics, Harvard Medical School, Boston, Mass (P.R.); Microsoft, Redmond, Wash (M.P.L.); and Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, Calif (M.P.L.)
| | - Pranav Rajpurkar
- From the Delft University of Technology, Delft, the Netherlands (L.R.K.); Segmed, 3790 El Camino Real #810, Palo Alto, CA 94306 (J.W., A.L., M.C., W.A.K., J.P., M.J.W.); Department of Radiology, University of Washington, Seattle, Wash (D.M.); Department of Radiology, OncoRad/Tumor Imaging Metrics Core, Seattle, Wash (D.M.); Harvard University, Cambridge, Mass (J.P.); Department of Radiology, Stanford University School of Medicine, Palo Alto, Calif (A.S.C.); Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, Calif (A.S.C.); Department of Biomedical Informatics, Harvard Medical School, Boston, Mass (P.R.); Microsoft, Redmond, Wash (M.P.L.); and Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, Calif (M.P.L.)
| | - Matthew P. Lungren
- From the Delft University of Technology, Delft, the Netherlands (L.R.K.); Segmed, 3790 El Camino Real #810, Palo Alto, CA 94306 (J.W., A.L., M.C., W.A.K., J.P., M.J.W.); Department of Radiology, University of Washington, Seattle, Wash (D.M.); Department of Radiology, OncoRad/Tumor Imaging Metrics Core, Seattle, Wash (D.M.); Harvard University, Cambridge, Mass (J.P.); Department of Radiology, Stanford University School of Medicine, Palo Alto, Calif (A.S.C.); Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, Calif (A.S.C.); Department of Biomedical Informatics, Harvard Medical School, Boston, Mass (P.R.); Microsoft, Redmond, Wash (M.P.L.); and Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, Calif (M.P.L.)
| | - Martin J. Willemink
- From the Delft University of Technology, Delft, the Netherlands (L.R.K.); Segmed, 3790 El Camino Real #810, Palo Alto, CA 94306 (J.W., A.L., M.C., W.A.K., J.P., M.J.W.); Department of Radiology, University of Washington, Seattle, Wash (D.M.); Department of Radiology, OncoRad/Tumor Imaging Metrics Core, Seattle, Wash (D.M.); Harvard University, Cambridge, Mass (J.P.); Department of Radiology, Stanford University School of Medicine, Palo Alto, Calif (A.S.C.); Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, Calif (A.S.C.); Department of Biomedical Informatics, Harvard Medical School, Boston, Mass (P.R.); Microsoft, Redmond, Wash (M.P.L.); and Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, Calif (M.P.L.)
| |
Collapse
|
29
|
Bi Y, Abrol A, Jia S, Sui J, Calhoun VD. Gray matters: ViT-GAN framework for identifying schizophrenia biomarkers linking structural MRI and functional network connectivity. Neuroimage 2024; 297:120674. [PMID: 38851549 DOI: 10.1016/j.neuroimage.2024.120674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/12/2024] [Revised: 06/02/2024] [Accepted: 06/06/2024] [Indexed: 06/10/2024] Open
Abstract
Brain disorders are often associated with changes in brain structure and function, where functional changes may be due to underlying structural variations. Gray matter (GM) volume segmentation from 3D structural MRI offers vital structural information for brain disorders like schizophrenia, as it encompasses essential brain tissues such as neuronal cell bodies, dendrites, and synapses, which are crucial for neural signal processing and transmission; changes in GM volume can thus indicate alterations in these tissues, reflecting underlying pathological conditions. In addition, functional network connectivity (FNC) matrices, obtained by applying the ICA algorithm to high-dimensional fMRI data, serve as an effective carrier of functional information. In our study, we introduce a new generative deep learning architecture, the conditional efficient vision transformer generative adversarial network (cEViT-GAN), which adeptly generates FNC matrices conditioned on GM to facilitate the exploration of potential connections between brain structure and function. We developed a new, lightweight self-attention mechanism for our ViT-based generator, enhancing the generation of refined attention maps critical for identifying structural biomarkers based on GM. Our approach not only generates high-quality FNC matrices with a Pearson correlation of 0.74 compared to real FNC data, but also uses attention-map technology to identify potential biomarkers in GM structure that could lead to functional abnormalities in schizophrenia patients. Visualization experiments within our study have highlighted these structural biomarkers, including the medial prefrontal cortex (mPFC), dorsolateral prefrontal cortex (DL-PFC), and cerebellum. In addition, through cross-domain analysis comparing generated and real FNC matrices, we have identified the functional connections most highly correlated with structural information, further validating the structure-function connections. This comprehensive analysis helps to elucidate the intricate relationship between brain structure and its functional manifestations, providing more refined insight into the neurobiological study of schizophrenia.
Collapse
Affiliation(s)
- Yuda Bi
- Tri-Institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS), Georgia State, Georgia Tech, Emory, Atlanta, GA 30303, USA.
| | - Anees Abrol
- Tri-Institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS), Georgia State, Georgia Tech, Emory, Atlanta, GA 30303, USA
| | - Sihan Jia
- Tri-Institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS), Georgia State, Georgia Tech, Emory, Atlanta, GA 30303, USA
| | - Jing Sui
- Tri-Institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS), Georgia State, Georgia Tech, Emory, Atlanta, GA 30303, USA
| | - Vince D Calhoun
- Tri-Institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS), Georgia State, Georgia Tech, Emory, Atlanta, GA 30303, USA
| |
Collapse
|
30
|
Liu Z, Kainth K, Zhou A, Deyer TW, Fayad ZA, Greenspan H, Mei X. A review of self-supervised, generative, and few-shot deep learning methods for data-limited magnetic resonance imaging segmentation. NMR IN BIOMEDICINE 2024; 37:e5143. [PMID: 38523402 DOI: 10.1002/nbm.5143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/01/2023] [Revised: 02/15/2024] [Accepted: 02/16/2024] [Indexed: 03/26/2024]
Abstract
Magnetic resonance imaging (MRI) is a ubiquitous medical imaging technology with applications in disease diagnostics, intervention, and treatment planning. Accurate MRI segmentation is critical for diagnosing abnormalities, monitoring diseases, and deciding on a course of treatment. With the advent of advanced deep learning frameworks, fully automated and accurate MRI segmentation is advancing. Traditional supervised deep learning techniques have advanced tremendously, reaching clinical-level accuracy in the field of segmentation. However, these algorithms still require a large amount of annotated data, which is oftentimes unavailable or impractical. One way to circumvent this issue is to utilize algorithms that exploit a limited amount of labeled data. This paper aims to review such state-of-the-art algorithms that use a limited number of annotated samples. We explain the fundamental principles of self-supervised learning, generative models, few-shot learning, and semi-supervised learning and summarize their applications in cardiac, abdomen, and brain MRI segmentation. Throughout this review, we highlight algorithms that can be employed based on the quantity of annotated data available. We also present a comprehensive list of notable publicly available MRI segmentation datasets. To conclude, we discuss possible future directions of the field-including emerging algorithms, such as contrastive language-image pretraining, and potential combinations across the methods discussed-that can further increase the efficacy of image segmentation with limited labels.
Collapse
Affiliation(s)
- Zelong Liu
- BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Komal Kainth
- BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Alexander Zhou
- BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Timothy W Deyer
- East River Medical Imaging, New York, New York, USA
- Department of Radiology, Cornell Medicine, New York, New York, USA
| | - Zahi A Fayad
- BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Department of Diagnostic, Molecular, and Interventional Radiology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Hayit Greenspan
- BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Department of Diagnostic, Molecular, and Interventional Radiology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Xueyan Mei
- BioMedical Engineering and Imaging Institute, Icahn School of Medicine at Mount Sinai, New York, New York, USA
- Department of Diagnostic, Molecular, and Interventional Radiology, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| |
Collapse
|
31
|
Guo X, Shi L, Chen X, Liu Q, Zhou B, Xie H, Liu YH, Palyo R, Miller EJ, Sinusas AJ, Staib L, Spottiswoode B, Liu C, Dvornek NC. TAI-GAN: A Temporally and Anatomically Informed Generative Adversarial Network for early-to-late frame conversion in dynamic cardiac PET inter-frame motion correction. Med Image Anal 2024; 96:103190. [PMID: 38820677 PMCID: PMC11180595 DOI: 10.1016/j.media.2024.103190] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2023] [Revised: 04/12/2024] [Accepted: 05/01/2024] [Indexed: 06/02/2024]
Abstract
Inter-frame motion in dynamic cardiac positron emission tomography (PET) using rubidium-82 (82Rb) myocardial perfusion imaging impacts myocardial blood flow (MBF) quantification and the diagnostic accuracy for coronary artery disease. However, the high cross-frame distribution variation due to rapid tracer kinetics poses a considerable challenge for inter-frame motion correction, especially for early frames where intensity-based image registration techniques often fail. To address this issue, we propose a novel method called Temporally and Anatomically Informed Generative Adversarial Network (TAI-GAN) that utilizes an all-to-one mapping to convert early frames into those with tracer distribution similar to the last reference frame. The TAI-GAN consists of a feature-wise linear modulation layer that encodes channel-wise parameters generated from temporal information, and rough cardiac segmentation masks with local shifts that serve as anatomical information. Our proposed method was evaluated on a clinical 82Rb PET dataset, and the results show that our TAI-GAN can produce converted early frames with high image quality, comparable to the real reference frames. After TAI-GAN conversion, motion estimation accuracy and subsequent MBF quantification with both conventional and deep learning-based motion correction methods were improved compared to using the original frames. The code is available at https://github.com/gxq1998/TAI-GAN.
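A feature-wise linear modulation (FiLM) layer of the kind named in this abstract can be sketched in a few lines; the conditioning dimensionality below is an arbitrary placeholder.

```python
# Sketch: FiLM layer applying condition-derived per-channel scale and shift.
import torch
import torch.nn as nn

class FiLM(nn.Module):
    # Per-channel gamma/beta are generated from a conditioning vector
    # (in TAI-GAN: temporal frame information; sizes here are placeholders).
    def __init__(self, cond_dim, channels):
        super().__init__()
        self.proj = nn.Linear(cond_dim, 2 * channels)

    def forward(self, feat, cond):
        gamma, beta = self.proj(cond).chunk(2, dim=-1)
        return gamma[..., None, None] * feat + beta[..., None, None]

film = FiLM(cond_dim=8, channels=32)
out = film(torch.randn(2, 32, 64, 64), torch.randn(2, 8))
print(out.shape)  # torch.Size([2, 32, 64, 64])
```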
Collapse
Affiliation(s)
- Xueqi Guo
- Department of Biomedical Engineering, Yale University, New Haven, CT, USA.
| | | | - Xiongchao Chen
- Department of Biomedical Engineering, Yale University, New Haven, CT, USA
| | - Qiong Liu
- Department of Biomedical Engineering, Yale University, New Haven, CT, USA
| | - Bo Zhou
- Department of Biomedical Engineering, Yale University, New Haven, CT, USA
| | - Huidong Xie
- Department of Biomedical Engineering, Yale University, New Haven, CT, USA
| | - Yi-Hwa Liu
- Department of Internal Medicine, Yale University, New Haven, CT, USA
| | | | - Edward J Miller
- Department of Biomedical Engineering, Yale University, New Haven, CT, USA; Department of Internal Medicine, Yale University, New Haven, CT, USA; Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, USA
| | - Albert J Sinusas
- Department of Biomedical Engineering, Yale University, New Haven, CT, USA; Department of Internal Medicine, Yale University, New Haven, CT, USA; Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, USA
| | - Lawrence Staib
- Department of Biomedical Engineering, Yale University, New Haven, CT, USA; Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, USA
| | | | - Chi Liu
- Department of Biomedical Engineering, Yale University, New Haven, CT, USA; Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, USA.
| | - Nicha C Dvornek
- Department of Biomedical Engineering, Yale University, New Haven, CT, USA; Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, USA.
| |
Collapse
|
32
|
Sinha A, Kawahara J, Pakzad A, Abhishek K, Ruthven M, Ghorbel E, Kacem A, Aouada D, Hamarneh G. DermSynth3D: Synthesis of in-the-wild annotated dermatology images. Med Image Anal 2024; 95:103145. [PMID: 38615432 DOI: 10.1016/j.media.2024.103145] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2023] [Revised: 02/11/2024] [Accepted: 03/18/2024] [Indexed: 04/16/2024]
Abstract
In recent years, deep learning (DL) has shown great potential in the field of dermatological image analysis. However, existing datasets in this domain have significant limitations, including a small number of image samples, limited disease conditions, insufficient annotations, and non-standardized image acquisitions. To address these shortcomings, we propose a novel framework called DermSynth3D. DermSynth3D blends skin disease patterns onto 3D textured meshes of human subjects using a differentiable renderer and generates 2D images from various camera viewpoints under chosen lighting conditions in diverse background scenes. Our method adheres to top-down rules that constrain the blending and rendering process to create 2D images with skin conditions that mimic in-the-wild acquisitions, ensuring more meaningful results. The framework generates photo-realistic 2D dermatological images and the corresponding dense annotations for semantic segmentation of the skin, skin conditions, body parts, bounding boxes around lesions, depth maps, and other 3D scene parameters, such as camera position and lighting conditions. DermSynth3D allows for the creation of custom datasets for various dermatology tasks. We demonstrate the effectiveness of data generated using DermSynth3D by training DL models on synthetic data and evaluating them on various dermatology tasks using real 2D dermatological images. We make our code publicly available at https://github.com/sfu-mial/DermSynth3D.
Collapse
Affiliation(s)
- Ashish Sinha
- Medical Image Analysis Lab, School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada
| | - Jeremy Kawahara
- Medical Image Analysis Lab, School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada
| | - Arezou Pakzad
- Medical Image Analysis Lab, School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada
| | - Kumar Abhishek
- Medical Image Analysis Lab, School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada
| | - Matthieu Ruthven
- Computer Vision, Imaging & Machine Intelligence Research Group, Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg, L-1855, Luxembourg
| | - Enjie Ghorbel
- Computer Vision, Imaging & Machine Intelligence Research Group, Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg, L-1855, Luxembourg; Cristal Laboratory, National School of Computer Sciences, University of Manouba, 2010, Tunisia
| | - Anis Kacem
- Computer Vision, Imaging & Machine Intelligence Research Group, Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg, L-1855, Luxembourg
| | - Djamila Aouada
- Computer Vision, Imaging & Machine Intelligence Research Group, Interdisciplinary Centre for Security, Reliability and Trust (SnT), University of Luxembourg, L-1855, Luxembourg
| | - Ghassan Hamarneh
- Medical Image Analysis Lab, School of Computing Science, Simon Fraser University, Burnaby V5A 1S6, Canada.
| |
Collapse
|
33
|
Gundogdu B, Medved M, Chatterjee A, Engelmann R, Rosado A, Lee G, Oren NC, Oto A, Karczmar GS. Self-supervised multicontrast super-resolution for diffusion-weighted prostate MRI. Magn Reson Med 2024; 92:319-331. [PMID: 38308149 PMCID: PMC11288973 DOI: 10.1002/mrm.30047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2023] [Revised: 01/19/2024] [Accepted: 01/23/2024] [Indexed: 02/04/2024]
Abstract
PURPOSE This study addresses the challenge of low resolution and signal-to-noise ratio (SNR) in diffusion-weighted images (DWI), which are pivotal for cancer detection. Traditional methods increase SNR at high b-values through multiple acquisitions, but this results in diminished image resolution due to motion-induced variations. Our research aims to enhance spatial resolution by exploiting the global structure within multicontrast DWI scans and the millimetric motion between acquisitions. METHODS We introduce a novel approach employing a "Perturbation Network" to learn subvoxel-size motions between scans, trained jointly with an implicit neural representation (INR) network. The INR encodes the DWI as a continuous volumetric function, treating voxel intensities of low-resolution acquisitions as discrete samples. By evaluating this function on a finer grid, our model predicts higher-resolution signal intensities for intermediate voxel locations. The Perturbation Network's motion-correction efficacy was validated through experiments on biological phantoms and in vivo prostate scans. RESULTS Quantitative analyses revealed significantly higher structural similarity measures of super-resolution images to ground-truth high-resolution images compared to high-order interpolation (p < 0.005). In blind qualitative experiments, 96.1% of super-resolution images were assessed to have superior diagnostic quality compared to interpolated images. CONCLUSION High-resolution details in DWI can be obtained without the need for high-resolution training data. Notably, the proposed method does not require a super-resolution training set, which is important in clinical practice because it can easily be adapted to images with different scanner settings or body parts, whereas supervised methods do not offer such an option.
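A minimal sketch of the INR-plus-Perturbation-Network pairing: a coordinate MLP represents the continuous volume, while a per-acquisition embedding predicts a small coordinate shift. The layer sizes and the 0.01 shift scale are illustrative assumptions, not the paper's configuration.

```python
# Sketch: coordinate-based INR with a learned sub-voxel shift per acquisition.
import torch
import torch.nn as nn

class INR(nn.Module):
    # Maps a continuous (x, y, z) coordinate to a signal intensity.
    def __init__(self, hidden=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(3, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, coords):
        return self.net(coords)

class PerturbationNet(nn.Module):
    # Predicts a sub-voxel shift per acquisition so that repeated
    # low-resolution scans supervise the same continuous volume.
    def __init__(self, n_acquisitions, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(n_acquisitions, hidden)
        self.head = nn.Sequential(nn.ReLU(), nn.Linear(hidden, 3))

    def forward(self, acq_idx):
        return 0.01 * torch.tanh(self.head(self.embed(acq_idx)))

inr, pert = INR(), PerturbationNet(n_acquisitions=4)
coords = torch.rand(1024, 3)
shift = pert(torch.zeros(1024, dtype=torch.long))  # all samples from scan 0
pred = inr(coords + shift)                         # intensities at shifted coords
print(pred.shape)  # torch.Size([1024, 1])
```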
Collapse
Affiliation(s)
- Batuhan Gundogdu
- Department of Radiology, University of Chicago, Chicago, Illinois, USA
| | - Milica Medved
- Department of Radiology, University of Chicago, Chicago, Illinois, USA
| | | | - Roger Engelmann
- Department of Radiology, University of Chicago, Chicago, Illinois, USA
| | - Avery Rosado
- Department of Radiology, University of Chicago, Chicago, Illinois, USA
| | - Grace Lee
- Department of Radiology, University of Chicago, Chicago, Illinois, USA
| | - Nisa C Oren
- Department of Radiology, University of Chicago, Chicago, Illinois, USA
| | - Aytekin Oto
- Department of Radiology, University of Chicago, Chicago, Illinois, USA
| | | |
Collapse
|
34
|
Hassanzadeh R, Abrol A, Hassanzadeh HR, Calhoun VD. Cross-Modality Translation with Generative Adversarial Networks to Unveil Alzheimer's Disease Biomarkers. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2024; 2024:1-4. [PMID: 40039975 DOI: 10.1109/embc53108.2024.10781737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/06/2025]
Abstract
Generative approaches for cross-modality transformation have recently gained significant attention in neuroimaging. While most previous work has focused on case-control data, the application of generative models to disorder-specific datasets and their ability to preserve diagnostic patterns remain relatively unexplored. Hence, in this study, we investigated the use of a generative adversarial network (GAN) in the context of Alzheimer's disease (AD) to generate functional network connectivity (FNC) and T1-weighted structural magnetic resonance imaging data from each other. We employed a cycle-GAN to synthesize data in an unpaired data transition and enhanced the transition by integrating weak supervision in cases where paired data were available. Our findings revealed that our model offers remarkable generative capability, achieving a structural similarity index measure (SSIM) of 0.89 ± 0.003 for T1s and a correlation of 0.71 ± 0.004 for FNCs. Moreover, our qualitative analysis revealed similar patterns between generated and actual data when comparing AD to cognitively normal (CN) individuals. In particular, we observed significantly increased functional connectivity in cerebellar-sensory motor and cerebellar-visual networks and reduced connectivity in cerebellar-subcortical, auditory-sensory motor, sensory motor-visual, and cerebellar-cognitive control networks. Additionally, the T1 images generated by our model showed a similar pattern of atrophy in the hippocampal and other temporal regions of Alzheimer's patients.
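The unpaired transition rests on the standard cycle-consistency objective; a generic sketch follows, with identity stand-ins for the two generators. The weak-supervision term mentioned in the abstract would add a direct paired loss where pairs exist.

```python
# Sketch: cycle-consistency loss for unpaired T1 <-> FNC translation.
import torch

def cycle_loss(real_t1, real_fnc, g_t1_to_fnc, g_fnc_to_t1, weight=10.0):
    # Translating to the other modality and back should reproduce the input;
    # where paired scans exist, an extra L1 term between g_t1_to_fnc(real_t1)
    # and the true FNC supplies the weak supervision noted above.
    rec_t1 = g_fnc_to_t1(g_t1_to_fnc(real_t1))
    rec_fnc = g_t1_to_fnc(g_fnc_to_t1(real_fnc))
    return weight * ((rec_t1 - real_t1).abs().mean() +
                     (rec_fnc - real_fnc).abs().mean())

identity = lambda x: x                       # stand-in generators
print(cycle_loss(torch.rand(2, 8), torch.rand(2, 8), identity, identity))
```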
Collapse
|
35
|
Chaudhary MFA, Gerard SE, Christensen GE, Cooper CB, Schroeder JD, Hoffman EA, Reinhardt JM. LungViT: Ensembling Cascade of Texture Sensitive Hierarchical Vision Transformers for Cross-Volume Chest CT Image-to-Image Translation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:2448-2465. [PMID: 38373126 PMCID: PMC11227912 DOI: 10.1109/tmi.2024.3367321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/21/2024]
Abstract
Chest computed tomography (CT) at inspiration is often complemented by an expiratory CT to identify peripheral airways disease. Additionally, co-registered inspiratory-expiratory volumes can be used to derive various markers of lung function. Expiratory CT scans, however, may not be acquired due to dose or scan time considerations or may be inadequate due to motion or insufficient exhale, leading to a missed opportunity to evaluate underlying small airways disease. Here, we propose LungViT, a generative adversarial learning approach using hierarchical vision transformers for translating inspiratory CT intensities to corresponding expiratory CT intensities. LungViT addresses several limitations of the traditional generative models, including slicewise discontinuities, limited size of generated volumes, and their inability to model texture transfer at the volumetric level. We propose a shifted-window hierarchical vision transformer architecture with squeeze-and-excitation decoder blocks for modeling dependencies between features. We also propose a multiview texture similarity distance metric for texture and style transfer in 3D. To incorporate global information into the training process and refine the output of our model, we use ensemble cascading. LungViT is able to generate large 3D volumes of size 320×320×320. We train and validate our model using a diverse cohort of 1500 subjects with varying disease severity. To assess model generalizability beyond the development set biases, we evaluate our model on an out-of-distribution external validation set of 200 subjects. Clinical validation on internal and external testing sets shows that synthetic volumes could be reliably adopted for deriving clinical endpoints of chronic obstructive pulmonary disease.
Collapse
|
36
|
Luo Y, Yang Q, Liu Z, Shi Z, Huang W, Zheng G, Cheng J. Target-Guided Diffusion Models for Unpaired Cross-Modality Medical Image Translation. IEEE J Biomed Health Inform 2024; 28:4062-4071. [PMID: 38662561 DOI: 10.1109/jbhi.2024.3393870] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 07/03/2024]
Abstract
In a clinical setting, the acquisition of certain medical image modalities is often unavailable due to various considerations such as cost, radiation, etc. Therefore, unpaired cross-modality translation techniques, which involve training on unpaired data and synthesizing the target modality with the guidance of the acquired source modality, are of great interest. Previous methods for synthesizing target medical images establish a one-shot mapping through generative adversarial networks (GANs). As promising alternatives to GANs, diffusion models have recently received wide interest in generative tasks. In this paper, we propose a target-guided diffusion model (TGDM) for unpaired cross-modality medical image translation. For training, to encourage our diffusion model to learn more visual concepts, we apply a perception-prioritized weighting scheme (P2W) to the training objectives. For sampling, a pre-trained classifier is adopted in the reverse process to suppress modality-specific remnants from the source data. Experiments on both brain MRI-CT and prostate MRI-US datasets demonstrate that the proposed method achieves visually realistic results that mimic vivid anatomical sections of the target organ. In addition, we conducted a subjective assessment based on the synthesized samples to further validate the clinical value of TGDM.
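Classifier guidance in the reverse process can be sketched as a gradient step on log p(y | x_t). The update below omits the noise-schedule coefficients and is a simplification for illustration, not TGDM's exact sampler.

```python
# Sketch: one simplified classifier-guided reverse-diffusion update.
import torch
import torch.nn as nn

def guided_step(x_t, eps_pred, classifier, y, scale=1.0):
    # The gradient of log p(y | x_t) nudges the sample toward the target
    # modality class; schedule terms are omitted for brevity.
    x_t = x_t.detach().requires_grad_(True)
    log_probs = torch.log_softmax(classifier(x_t), dim=-1)
    selected = log_probs[torch.arange(len(y)), y].sum()
    grad = torch.autograd.grad(selected, x_t)[0]
    return (x_t - eps_pred + scale * grad).detach()

clf = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 2))   # toy classifier
x = torch.randn(4, 1, 32, 32)
out = guided_step(x, torch.zeros_like(x), clf, torch.zeros(4, dtype=torch.long))
print(out.shape)  # torch.Size([4, 1, 32, 32])
```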
Collapse
|
37
|
Meng X, Sun K, Xu J, He X, Shen D. Multi-Modal Modality-Masked Diffusion Network for Brain MRI Synthesis With Random Modality Missing. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:2587-2598. [PMID: 38393846 DOI: 10.1109/tmi.2024.3368664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/25/2024]
Abstract
Synthesis of unavailable imaging modalities from available ones can generate modality-specific complementary information and enable multi-modality-based medical image diagnosis and treatment. Existing generative methods for medical image synthesis are usually based on cross-modal translation between acquired and missing modalities. These methods are typically dedicated to a specific missing modality and perform synthesis in one shot, so they can neither flexibly handle a varying number of missing modalities nor effectively construct the mapping across modalities. To address these issues, we propose a unified Multi-modal Modality-masked Diffusion Network (M2DN), tackling multi-modal synthesis from the perspective of "progressive whole-modality inpainting" instead of "cross-modal translation". Specifically, our M2DN treats the missing modalities as random noise and processes all modalities as a unified whole in each reverse diffusion step. The proposed joint synthesis scheme performs synthesis for the missing modalities and self-reconstruction for the available ones, which not only enables synthesis for arbitrary missing scenarios but also facilitates the construction of a common latent space and enhances the model's representation ability. Besides, we introduce a modality-mask scheme that explicitly encodes the availability status of each incoming modality in a binary mask, which is adopted as a condition for the diffusion model to further enhance the synthesis performance of our M2DN in arbitrary missing scenarios. We carry out experiments on two public brain MRI datasets for synthesis and downstream segmentation tasks. Experimental results demonstrate that our M2DN significantly outperforms state-of-the-art models and generalizes well to arbitrary missing modalities.
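To make the "whole-modality inpainting" idea concrete, here is a hypothetical sketch of how a noisy joint input and the binary availability mask might be assembled at one reverse step. The tensor shapes and the concatenation-based conditioning are my assumptions for illustration, not the paper's exact design.
```python
import torch

def masked_diffusion_input(modalities: torch.Tensor, avail: torch.Tensor,
                           noise_level: torch.Tensor) -> torch.Tensor:
    """Build the joint input for one reverse-diffusion step.
    modalities : (B, M, H, W) stack of all modalities (missing slots may be zeros)
    avail      : (B, M) binary availability mask (1 = acquired, 0 = missing)
    noise_level: (B, 1, 1, 1) current noise scale sigma_t
    Missing modalities are replaced by noise; available ones are kept so the
    model can self-reconstruct them, and the mask is appended as explicit
    condition channels."""
    noise = torch.randn_like(modalities)
    m = avail[:, :, None, None].float()
    x = m * modalities + (1.0 - m) * (noise * noise_level)
    mask_channels = m.expand_as(modalities)
    return torch.cat([x, mask_channels], dim=1)   # (B, 2M, H, W)

x = masked_diffusion_input(torch.randn(2, 4, 64, 64),
                           torch.tensor([[1, 0, 1, 1], [1, 1, 0, 0]]),
                           torch.full((2, 1, 1, 1), 0.5))
```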
Collapse
|
38
|
Broll A, Goldhacker M, Hahnel S, Rosentritt M. Generative deep learning approaches for the design of dental restorations: A narrative review. J Dent 2024; 145:104988. [PMID: 38608832 DOI: 10.1016/j.jdent.2024.104988] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2024] [Revised: 03/13/2024] [Accepted: 04/03/2024] [Indexed: 04/14/2024] Open
Abstract
OBJECTIVES This study aims to explore and discuss recent advancements in tooth reconstruction utilizing deep learning (DL) techniques. A review of new DL methodologies for partial and full tooth reconstruction is conducted. DATA/SOURCES PubMed, Google Scholar, and IEEE Xplore databases were searched for articles from 2003 to 2023. STUDY SELECTION The review includes 9 articles published from 2018 to 2023. The selected articles showcase novel DL approaches for tooth reconstruction, while those concentrating solely on the application or review of DL methods are excluded. The review shows that data is acquired via intraoral scans or laboratory scans of dental plaster models. Common data representations are depth maps, point clouds, and voxelized point clouds. Reconstructions focus on single teeth, using data from adjacent teeth or the entire jaw. Some articles include antagonist teeth data and features like occlusal grooves and gap distance. Primary network architectures include Generative Adversarial Networks (GANs) and Transformers. Compared to conventional digital methods, DL-based tooth reconstruction achieves error rates approximately two times lower. CONCLUSIONS Generative DL models analyze dental datasets to reconstruct missing teeth by extracting insights into patterns and structures. Through specialized application, these models reconstruct morphologically and functionally sound dental structures, leveraging information from the existing teeth. The reported advancements demonstrate the feasibility of DL-based dental crown reconstruction. Beyond GANs and Transformers with point clouds or voxels, recent studies indicate promising outcomes with diffusion-based architectures and innovative data representations like wavelets for 3D shape completion and inference problems. CLINICAL SIGNIFICANCE Generative network architectures employed in the analysis and reconstruction of dental structures demonstrate notable proficiency. The enhanced accuracy and efficiency of DL-based frameworks hold the potential to improve clinical outcomes and increase patient satisfaction. The reduced reconstruction times and diminished requirement for manual intervention may lead to cost savings and improved accessibility of dental services.
Collapse
Affiliation(s)
- Alexander Broll
- Department of Prosthetic Dentistry, University Hospital Regensburg, Regensburg, Germany
| | - Markus Goldhacker
- Faculty of Mechanical Engineering, OTH Regensburg, Regensburg, Germany
| | - Sebastian Hahnel
- Department of Prosthetic Dentistry, University Hospital Regensburg, Regensburg, Germany
| | - Martin Rosentritt
- Department of Prosthetic Dentistry, University Hospital Regensburg, Regensburg, Germany
| |
Collapse
|
39
|
Lu X, Liang X, Liu W, Miao X, Guan X. ReeGAN: MRI image edge-preserving synthesis based on GANs trained with misaligned data. Med Biol Eng Comput 2024; 62:1851-1868. [PMID: 38396277 DOI: 10.1007/s11517-024-03035-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/20/2023] [Accepted: 01/27/2024] [Indexed: 02/25/2024]
Abstract
Different modalities of magnetic resonance imaging (MRI), a crucial medical examination technique, complement each other, offering multi-angle and multi-dimensional insights into the body's internal information. Research on MRI cross-modality conversion is therefore of great significance, and many innovative techniques have been explored. However, most methods are trained on well-aligned data, and the impact of misaligned data has not received sufficient attention. Additionally, many methods focus on transforming the entire image and ignore crucial edge information. To address these challenges, we propose a generative adversarial network based on multi-feature fusion that effectively preserves edge information while training on noisy data. Notably, we treat images subjected to random transformations of limited range as noisy labels and use a small auxiliary registration network to help the generator adapt to the noise distribution. Moreover, we inject auxiliary edge information to improve the quality of the synthesized target-modality images. Our goal is to find the best solution for cross-modality conversion. Comprehensive experiments and ablation studies demonstrate the effectiveness of the proposed method.
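One common way to inject auxiliary edge information, and a plausible reading of what this abstract describes, is to compute a gradient-magnitude edge map and feed it to the generator as an extra input channel. A minimal sketch using Sobel filters (the channel-stacking scheme is an assumption, not ReeGAN's verified design):
```python
import torch
import torch.nn.functional as F

def sobel_edges(img: torch.Tensor) -> torch.Tensor:
    """Gradient-magnitude edge map of a (B, 1, H, W) image."""
    kx = torch.tensor([[-1., 0., 1.],
                       [-2., 0., 2.],
                       [-1., 0., 1.]]).view(1, 1, 3, 3).to(img.device)
    ky = kx.transpose(2, 3)                    # Sobel kernel for the y direction
    gx = F.conv2d(img, kx, padding=1)
    gy = F.conv2d(img, ky, padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)

src = torch.randn(1, 1, 128, 128)
g_in = torch.cat([src, sobel_edges(src)], dim=1)  # image + edges for the generator
```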
Collapse
Affiliation(s)
- Xiangjiang Lu
- Guangxi Key Lab of Multi-Source Information Mining & Security, School of Computer Science and Engineering & School of Software, Guangxi Normal University, Guilin, 541004, China.
| | - Xiaoshuang Liang
- Guangxi Key Lab of Multi-Source Information Mining & Security, School of Computer Science and Engineering & School of Software, Guangxi Normal University, Guilin, 541004, China
| | - Wenjing Liu
- Guangxi Key Lab of Multi-Source Information Mining & Security, School of Computer Science and Engineering & School of Software, Guangxi Normal University, Guilin, 541004, China
| | - Xiuxia Miao
- Guangxi Key Lab of Multi-Source Information Mining & Security, School of Computer Science and Engineering & School of Software, Guangxi Normal University, Guilin, 541004, China
| | - Xianglong Guan
- Guangxi Key Lab of Multi-Source Information Mining & Security, School of Computer Science and Engineering & School of Software, Guangxi Normal University, Guilin, 541004, China
| |
Collapse
|
40
|
Li L, Yu J, Li Y, Wei J, Fan R, Wu D, Ye Y. Multi-sequence generative adversarial network: better generation for enhanced magnetic resonance imaging images. Front Comput Neurosci 2024; 18:1365238. [PMID: 38841427 PMCID: PMC11151883 DOI: 10.3389/fncom.2024.1365238] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/04/2024] [Accepted: 03/27/2024] [Indexed: 06/07/2024] Open
Abstract
Introduction MRI is one of the most commonly used diagnostic methods in clinical practice, especially for brain diseases. Among the many MRI sequences, T1CE images can only be obtained by using contrast agents. Many patients (such as cancer patients) must undergo multiple MRI sequences for diagnosis, especially the contrast-enhanced magnetic resonance sequence. However, some patients, such as pregnant women and children, cannot easily receive contrast agents to obtain enhanced sequences, and contrast agents carry many adverse reactions that can pose a significant risk. With the continuous development of deep learning, the emergence of generative adversarial networks has made it possible to extract features from one type of image to generate another. Methods We propose a generative adversarial network model with multimodal inputs and end-to-end decoding based on the pix2pix model. We used four evaluation metrics, NMSE, RMSE, SSIM, and PSNR, to assess the effectiveness of the generative model. Results Through statistical analysis, we compared our proposed model with pix2pix and found significant differences between the two. Our model outperformed pix2pix, with higher SSIM and PSNR and lower NMSE and RMSE. We also found that inputting T1W and T2W images together worked better than other combinations, providing new ideas for subsequent work on generating contrast-enhanced magnetic resonance images. Using our model, it is possible to generate contrast-enhanced sequence images from non-enhanced sequence images. Discussion This has significant implications, as it can greatly reduce the use of contrast agents and protect populations, such as pregnant women and children, for whom contrast agents are contraindicated. Additionally, contrast agents are relatively expensive, and this generation method may bring substantial economic benefits.
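For reference, two of the four metrics used here have simple closed forms; a minimal sketch of PSNR and NMSE (the data-range convention varies between papers, so treat it as an assumption):
```python
import numpy as np

def psnr(ref: np.ndarray, syn: np.ndarray, data_range=None) -> float:
    """Peak signal-to-noise ratio in dB; higher is better."""
    if data_range is None:
        data_range = ref.max() - ref.min()   # one common convention
    mse = np.mean((ref - syn) ** 2)
    return float(10.0 * np.log10(data_range ** 2 / mse))

def nmse(ref: np.ndarray, syn: np.ndarray) -> float:
    """Normalized mean squared error relative to the reference energy; lower is better."""
    return float(np.sum((ref - syn) ** 2) / np.sum(ref ** 2))

ref = np.random.rand(64, 64)
syn = ref + 0.01 * np.random.randn(64, 64)
print(psnr(ref, syn), nmse(ref, syn))
```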
Collapse
Affiliation(s)
- Leizi Li
- South China Normal University-Panyu Central Hospital Joint Laboratory of Basic and Translational Medical Research, Guangzhou Panyu Central Hospital, Guangzhou, China
- Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring and Guangdong Provincial Engineering Technology Research Center for Drug and Food Biological Resources Processing and Comprehensive Utilization, School of Life Sciences, South China Normal University, Guangzhou, China
| | - Jingchun Yu
- Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring and Guangdong Provincial Engineering Technology Research Center for Drug and Food Biological Resources Processing and Comprehensive Utilization, School of Life Sciences, South China Normal University, Guangzhou, China
| | - Yijin Li
- Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring and Guangdong Provincial Engineering Technology Research Center for Drug and Food Biological Resources Processing and Comprehensive Utilization, School of Life Sciences, South China Normal University, Guangzhou, China
| | - Jinbo Wei
- South China Normal University-Panyu Central Hospital Joint Laboratory of Basic and Translational Medical Research, Guangzhou Panyu Central Hospital, Guangzhou, China
| | - Ruifang Fan
- Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring and Guangdong Provincial Engineering Technology Research Center for Drug and Food Biological Resources Processing and Comprehensive Utilization, School of Life Sciences, South China Normal University, Guangzhou, China
| | - Dieen Wu
- South China Normal University-Panyu Central Hospital Joint Laboratory of Basic and Translational Medical Research, Guangzhou Panyu Central Hospital, Guangzhou, China
| | - Yufeng Ye
- South China Normal University-Panyu Central Hospital Joint Laboratory of Basic and Translational Medical Research, Guangzhou Panyu Central Hospital, Guangzhou, China
- Medical Imaging Institute of Panyu, Guangzhou, China
| |
Collapse
|
41
|
Dai X, Ma N, Du L, Wang X, Ju Z, Jie C, Gong H, Ge R, Yu W, Qu B. Application of MR images in radiotherapy planning for brain tumor based on deep learning. Int J Neurosci 2024:1-11. [PMID: 38712669 DOI: 10.1080/00207454.2024.2352784] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2024] [Accepted: 05/03/2024] [Indexed: 05/08/2024]
Abstract
PURPOSE To explore the role of MR images in radiotherapy planning and the accuracy of MR-based dose calculation using deep learning methods. METHODS 131 brain tumor patients undergoing radiotherapy, with prior MR and CT images, were recruited for this study. The MR images were first registered to the CT images using MIM software and then resampled. A deep learning method (U-Net) was used to establish an MRI-to-CT conversion model, for which images from 105 patients were used as the training set and images from 26 patients as the tuning set. Data from an additional 8 patients were collected as the test set, and the accuracy of the model was evaluated from a dosimetric standpoint. RESULTS Comparing the synthetic CT images with the original CT images, the differences in the dosimetric parameters D98, D95, D2, and Dmean of the PTV in the 8 test patients were less than 0.5%. The gamma pass rates for the PTV were 93.96% ± 6.75% (1%/1 mm), 99.87% ± 0.30% (2%/2 mm), and 100.00% ± 0.00% (3%/3 mm); for the whole body volume they were 99.14% ± 0.80% (1%/1 mm), 99.92% ± 0.08% (2%/2 mm), and 99.99% ± 0.01% (3%/3 mm). CONCLUSION MR images can be used for delineation, treatment efficacy evaluation, and dose calculation. Converting MR images to synthetic CT images with deep learning is viable and can be further used in dose calculation.
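A toy version of the U-Net used for MRI-to-CT conversion helps fix the idea: an encoder-decoder with skip connections that regresses a CT-intensity map per voxel. This two-level sketch is far shallower than any clinical model; depths, channel widths, and the single-slice 2D setting are all illustrative assumptions.
```python
import torch
import torch.nn as nn

def block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
                         nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    """Two-level U-Net mapping an MR slice to a synthetic-CT slice."""
    def __init__(self):
        super().__init__()
        self.enc1, self.enc2 = block(1, 32), block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec = block(64, 32)
        self.head = nn.Conv2d(32, 1, 1)   # regress CT intensity per pixel

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        d = self.dec(torch.cat([self.up(e2), e1], dim=1))  # skip connection
        return self.head(d)

sct = TinyUNet()(torch.randn(1, 1, 128, 128))  # (1, 1, 128, 128) synthetic CT
```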
Collapse
Affiliation(s)
- Xiangkun Dai
- Department of Radiotherapy, First Medical Center of PLA General Hospital, Beijing, China
| | - Na Ma
- Department of Radiotherapy, First Medical Center of PLA General Hospital, Beijing, China
- School of Biological Science and Medical Engineering, Beihang, University, Beijing, China
| | - Lehui Du
- Department of Radiotherapy, First Medical Center of PLA General Hospital, Beijing, China
| | | | - Zhongjian Ju
- Department of Radiotherapy, First Medical Center of PLA General Hospital, Beijing, China
| | - Chuanbin Jie
- Department of Radiotherapy, First Medical Center of PLA General Hospital, Beijing, China
| | - Hanshun Gong
- Department of Radiotherapy, First Medical Center of PLA General Hospital, Beijing, China
| | - Ruigang Ge
- Department of Radiotherapy, First Medical Center of PLA General Hospital, Beijing, China
| | - Wei Yu
- Department of Radiotherapy, First Medical Center of PLA General Hospital, Beijing, China
| | - Baolin Qu
- Department of Radiotherapy, First Medical Center of PLA General Hospital, Beijing, China
| |
Collapse
|
42
|
Dalmaz O, Mirza MU, Elmas G, Ozbey M, Dar SUH, Ceyani E, Oguz KK, Avestimehr S, Çukur T. One model to unite them all: Personalized federated learning of multi-contrast MRI synthesis. Med Image Anal 2024; 94:103121. [PMID: 38402791 DOI: 10.1016/j.media.2024.103121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2023] [Revised: 02/20/2024] [Accepted: 02/21/2024] [Indexed: 02/27/2024]
Abstract
Curation of large, diverse MRI datasets via multi-institutional collaborations can help improve learning of generalizable synthesis models that reliably translate source- onto target-contrast images. To facilitate collaborations, federated learning (FL) adopts decentralized model training while mitigating privacy concerns by avoiding sharing of imaging data. However, conventional FL methods can be impaired by the inherent heterogeneity in the data distribution, with domain shifts evident within and across imaging sites. Here we introduce the first personalized FL method for MRI Synthesis (pFLSynth) that improves reliability against data heterogeneity via model specialization to individual sites and synthesis tasks (i.e., source-target contrasts). To do this, pFLSynth leverages an adversarial model equipped with novel personalization blocks that control the statistics of generated feature maps across the spatial/channel dimensions, given latent variables specific to sites and tasks. To further promote communication efficiency and site specialization, partial network aggregation is employed over later generator stages while earlier generator stages and the discriminator are trained locally. As such, pFLSynth enables multi-task training of multi-site synthesis models with high generalization performance across sites and tasks. Comprehensive experiments demonstrate the superior performance and reliability of pFLSynth in MRI synthesis against prior federated methods.
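The partial network aggregation described here, where later generator stages are shared across sites while earlier stages and the discriminator stay local, can be sketched as federated averaging restricted to a subset of parameter names. The `generator.late` prefix below is a placeholder naming convention, not the paper's actual module path.
```python
from collections import OrderedDict
import torch

def partial_fedavg(site_states, shared_prefixes=("generator.late",)):
    """Average only parameters whose names match a shared prefix; everything
    else (early generator stages, discriminator, personalization blocks)
    remains site-specific."""
    global_state = OrderedDict()
    for name in site_states[0]:
        if any(name.startswith(p) for p in shared_prefixes):
            global_state[name] = torch.stack(
                [s[name].float() for s in site_states]).mean(dim=0)
    return global_state

# Each site then merges the shared weights back into its local model:
# model.load_state_dict(partial_fedavg(collected_states), strict=False)
```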
Collapse
Affiliation(s)
- Onat Dalmaz
- Department of Electrical and Electronics Engineering, Bilkent University, Ankara 06800, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara 06800, Turkey
| | - Muhammad U Mirza
- Department of Electrical and Electronics Engineering, Bilkent University, Ankara 06800, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara 06800, Turkey
| | - Gokberk Elmas
- Department of Electrical and Electronics Engineering, Bilkent University, Ankara 06800, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara 06800, Turkey
| | - Muzaffer Ozbey
- Department of Electrical and Electronics Engineering, Bilkent University, Ankara 06800, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara 06800, Turkey
| | - Salman U H Dar
- Department of Electrical and Electronics Engineering, Bilkent University, Ankara 06800, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara 06800, Turkey
| | - Emir Ceyani
- Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA 90089, USA
| | - Kader K Oguz
- Department of Radiology, University of California, Davis Medical Center, Sacramento, CA 95817, USA
| | - Salman Avestimehr
- Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA 90089, USA
| | - Tolga Çukur
- Department of Electrical and Electronics Engineering, Bilkent University, Ankara 06800, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara 06800, Turkey; Neuroscience Program, Bilkent University, Ankara 06800, Turkey.
| |
Collapse
|
43
|
Huynh N, Yan D, Ma Y, Wu S, Long C, Sami MT, Almudaifer A, Jiang Z, Chen H, Dretsch MN, Denney TS, Deshpande R, Deshpande G. The Use of Generative Adversarial Network and Graph Convolution Network for Neuroimaging-Based Diagnostic Classification. Brain Sci 2024; 14:456. [PMID: 38790434 PMCID: PMC11119064 DOI: 10.3390/brainsci14050456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/19/2024] [Revised: 04/21/2024] [Accepted: 04/23/2024] [Indexed: 05/26/2024] Open
Abstract
Functional connectivity (FC) obtained from resting-state functional magnetic resonance imaging has been integrated with machine learning algorithms to deliver consistent and reliable brain disease classification outcomes. However, in classical learning procedures, custom-built specialized feature selection techniques are typically used to filter out uninformative features from FC patterns so that models generalize efficiently across datasets. The ability of convolutional neural networks (CNNs) and other deep learning models to extract informative features from data with grid structure (such as images) has led to a surge in the popularity of these techniques. However, the designs of many existing CNN models still fail to exploit the relationships between entities of graph-structured data (such as networks). Therefore, the graph convolution network (GCN) has been suggested as a means of uncovering the intricate structure of brain network data, with the potential to substantially improve classification accuracy. Furthermore, overfitting in classifiers can be largely attributed to the limited number of available training samples. Recently, the generative adversarial network (GAN) has been widely used in the medical field because it can generate synthetic images to cope with data scarcity and patient-privacy concerns. In our previous work, a GCN and a GAN were designed to investigate FC patterns for diagnostic tasks, and their effectiveness was tested on the ABIDE-I dataset. In this paper, the models are further applied to FC data derived from additional public datasets (ADHD-200, ABIDE-II, and ADNI) and our in-house dataset (PTSD) to assess their generalization across data types. The results of a number of experiments show the powerful ability of the GAN to mimic FC data and achieve high performance in disease prediction. When employing the GAN for data augmentation, the diagnostic accuracy across the ADHD-200, ABIDE-II, and ADNI datasets surpasses that of other machine learning models, including BrainNetCNN. Specifically, with the GAN, accuracy increased from 67.74% to 73.96% on ADHD, from 70.36% to 77.40% on ABIDE-II, and reached 52.84% and 88.56% on ADNI for multiclass and binary classification, respectively. The GCN also obtains decent results, with the best accuracy on the ADHD dataset (71.38% for multinomial and 75% for binary classification) and the second-best accuracy on the ABIDE-II dataset (72.28% and 75.16%, respectively). Both the GAN and the GCN achieved their highest accuracy on the PTSD dataset, reaching 97.76%. Some limitations remain to be addressed, but both methods hold promise for the prediction and diagnosis of diseases.
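The GCN referenced here follows the standard graph-convolution propagation rule, H' = ReLU(D^(-1/2)(A + I)D^(-1/2) H W), applied to functional-connectivity graphs whose nodes are brain regions. A minimal sketch (the 90-region atlas size and the adjacency thresholding are illustrative assumptions):
```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph-convolution layer with symmetric adjacency normalization."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        a = adj + torch.eye(adj.size(-1), device=adj.device)   # add self-loops
        d = a.sum(-1)                                          # node degrees
        a_norm = a / torch.sqrt(d[..., None] * d[..., None, :])  # D^-1/2 A D^-1/2
        return torch.relu(a_norm @ self.lin(h))

# Node features = rows of the FC matrix; edges = thresholded connectivity.
fc = torch.rand(4, 90, 90)
fc = (fc + fc.transpose(1, 2)) / 2                 # symmetrize
out = GCNLayer(90, 32)(fc, (fc > 0.5).float())     # (4, 90, 32) node embeddings
```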
Collapse
Affiliation(s)
- Nguyen Huynh
- Auburn University Neuroimaging Center, Department of Electrical and Computer Engineering, Auburn University, Auburn, AL 36849, USA; (N.H.); (T.S.D.)
| | - Da Yan
- Department of Computer Sciences, Indiana University Bloomington, Bloomington, IN 47405, USA;
| | - Yueen Ma
- Department of Computer Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong;
| | - Shengbin Wu
- Department of Mechanical Engineering, University of California, Berkeley, CA 94720, USA;
| | - Cheng Long
- School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore;
| | - Mirza Tanzim Sami
- Department of Computer Sciences, University of Alabama at Birmingham, Birmingham, AL 35294, USA; (M.T.S.); (A.A.)
| | - Abdullateef Almudaifer
- Department of Computer Sciences, University of Alabama at Birmingham, Birmingham, AL 35294, USA; (M.T.S.); (A.A.)
- College of Computer Science and Engineering, Taibah University, Yanbu 41477, Saudi Arabia
| | - Zhe Jiang
- Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL 32611, USA;
| | - Haiquan Chen
- Department of Computer Sciences, California State University, Sacramento, CA 95819, USA;
| | - Michael N. Dretsch
- Walter Reed Army Institute of Research-West, Joint Base Lewis-McChord, WA 98433, USA;
| | - Thomas S. Denney
- Auburn University Neuroimaging Center, Department of Electrical and Computer Engineering, Auburn University, Auburn, AL 36849, USA; (N.H.); (T.S.D.)
- Department of Psychological Sciences, Auburn University, Auburn, AL 36849, USA
- Alabama Advanced Imaging Consortium, Birmingham, AL 36849, USA
- Center for Neuroscience, Auburn University, Auburn, AL 36849, USA
| | - Rangaprakash Deshpande
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Harvard Medical School, Charlestown, MA 02129, USA;
| | - Gopikrishna Deshpande
- Auburn University Neuroimaging Center, Department of Electrical and Computer Engineering, Auburn University, Auburn, AL 36849, USA; (N.H.); (T.S.D.)
- Department of Psychological Sciences, Auburn University, Auburn, AL 36849, USA
- Alabama Advanced Imaging Consortium, Birmingham, AL 36849, USA
- Center for Neuroscience, Auburn University, Auburn, AL 36849, USA
- Department of Psychiatry, National Institute of Mental Health and Neurosciences, Bangalore 560030, India
- Department of Heritage Science and Technology, Indian Institute of Technology, Hyderabad 502285, India
| |
Collapse
|
44
|
Fan M, Cao X, Lü F, Xie S, Yu Z, Chen Y, Lü Z, Li L. Generative adversarial network-based synthesis of contrast-enhanced MR images from precontrast images for predicting histological characteristics in breast cancer. Phys Med Biol 2024; 69:095002. [PMID: 38537294 DOI: 10.1088/1361-6560/ad3889] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/27/2023] [Accepted: 03/27/2024] [Indexed: 04/16/2024]
Abstract
Objective. Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) is a sensitive tool for assessing breast cancer by analyzing tumor blood flow, but it requires gadolinium-based contrast agents, which carry risks such as brain retention and astrocyte migration. Contrast-free MRI is thus preferable for patients with renal impairment or who are pregnant. This study aimed to investigate the feasibility of generating contrast-enhanced MR images from precontrast images and to evaluate the potential use of synthetic images in diagnosing breast cancer. Approach. This retrospective study included 322 women with invasive breast cancer who underwent preoperative DCE-MRI. A generative adversarial network (GAN) based postcontrast image synthesis (GANPIS) model with perceptual loss was proposed to generate contrast-enhanced MR images from precontrast images. The quality of the synthesized images was evaluated using the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). The diagnostic performance of the generated images was assessed using a convolutional neural network to predict Ki-67, luminal A, and histological grade with the area under the receiver operating characteristic curve (AUC). The patients were divided into training (n = 200), validation (n = 60), and testing (n = 62) sets. Main results. Quantitative analysis revealed strong agreement between the generated and real postcontrast images in the test set, with PSNR and SSIM values of 36.210 ± 2.670 and 0.988 ± 0.006, respectively. The generated postcontrast images achieved AUCs of 0.918 ± 0.018, 0.842 ± 0.028, and 0.815 ± 0.019 for predicting the Ki-67 expression level, histological grade, and luminal A subtype, respectively. These results showed a significant improvement over the use of precontrast images alone, which achieved AUCs of 0.764 ± 0.031, 0.741 ± 0.035, and 0.797 ± 0.021, respectively. Significance. This study proposed a GAN-based MR image synthesis method for breast cancer that generates postcontrast images from precontrast images, allowing contrast-free images to simulate kinetic features for improved diagnosis.
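The perceptual loss used in this kind of GAN is typically an L1 or L2 distance between pretrained-network feature maps of the generated and real images. A minimal sketch using torchvision's VGG-16 (the layer cutoff and the grayscale-to-RGB repetition are illustrative assumptions, not the paper's verified configuration):
```python
import torch
import torch.nn as nn
from torchvision.models import vgg16

class PerceptualLoss(nn.Module):
    """L1 distance between frozen VGG-16 feature maps of fake vs. real images."""
    def __init__(self, layer: int = 16):
        super().__init__()
        self.features = vgg16(weights="DEFAULT").features[:layer].eval()
        for p in self.features.parameters():
            p.requires_grad_(False)           # the loss network stays fixed

    def forward(self, fake: torch.Tensor, real: torch.Tensor) -> torch.Tensor:
        # single-channel MR slices -> 3 channels expected by VGG
        f3, r3 = fake.repeat(1, 3, 1, 1), real.repeat(1, 3, 1, 1)
        return torch.mean(torch.abs(self.features(f3) - self.features(r3)))

loss = PerceptualLoss()(torch.rand(1, 1, 224, 224), torch.rand(1, 1, 224, 224))
```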
Collapse
Affiliation(s)
- Ming Fan
- Institute of Intelligent Biomedicine, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, People's Republic of China
| | - Xuan Cao
- Institute of Intelligent Biomedicine, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, People's Republic of China
| | - Fuqing Lü
- Institute of Intelligent Biomedicine, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, People's Republic of China
| | - Sangma Xie
- Institute of Intelligent Biomedicine, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, People's Republic of China
| | - Zhou Yu
- Institute of Intelligent Biomedicine, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, People's Republic of China
| | - Yuanlin Chen
- Institute of Intelligent Biomedicine, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, People's Republic of China
| | - Zhong Lü
- Affiliated Dongyang Hospital of Wenzhou Medical University, People's Republic of China
| | - Lihua Li
- Institute of Intelligent Biomedicine, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, People's Republic of China
| |
Collapse
|
45
|
Fu L, Li X, Cai X, Miao D, Yao Y, Shen Y. Energy-guided diffusion model for CBCT-to-CT synthesis. Comput Med Imaging Graph 2024; 113:102344. [PMID: 38320336 DOI: 10.1016/j.compmedimag.2024.102344] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2023] [Revised: 01/29/2024] [Accepted: 01/29/2024] [Indexed: 02/08/2024]
Abstract
Cone Beam Computed Tomography (CBCT) plays a crucial role in Image-Guided Radiation Therapy (IGRT), providing essential assurance of accuracy in radiation treatment by monitoring changes in anatomical structures during the treatment process. However, CBCT images often suffer from scatter noise and artifacts, posing a significant challenge when relying solely on CBCT for precise dose calculation and accurate tissue localization. There is an urgent need to enhance the quality of CBCT images, enabling more practical application in IGRT. This study introduces EGDiff, a novel framework based on the diffusion model, designed to address the challenges posed by scatter noise and artifacts in CBCT images. In our approach, we employ a forward diffusion process that adds Gaussian noise to CT images, followed by a reverse denoising process using a ResUNet with an attention mechanism to predict the noise, ultimately synthesizing CT-quality images from CBCT. Additionally, we design an energy-guided function to retain domain-independent features and discard domain-specific features during the denoising process, enhancing the effectiveness of CBCT-to-CT generation. We conduct extensive experiments on a thorax dataset and a pancreas dataset. EGDiff performs strongly on the thoracic tumor dataset, with an SSIM of 0.850, an MAE of 26.87 HU, a PSNR of 19.83 dB, and an NCC of 0.874, and it outperforms state-of-the-art CBCT-to-CT synthesis methods on the pancreas dataset, with an SSIM of 0.754, an MAE of 32.19 HU, a PSNR of 19.35 dB, and an NCC of 0.846. By improving the accuracy and reliability of CBCT images, EGDiff can enhance the precision of radiation therapy, minimize radiation exposure to healthy tissues, and ultimately contribute to more effective and personalized cancer treatment strategies.
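The forward diffusion process mentioned here has the standard closed form x_t = sqrt(a_bar_t) x_0 + sqrt(1 - a_bar_t) eps, and the denoiser (a ResUNet with attention in this paper) is trained to recover eps. A minimal sketch with an illustrative linear beta schedule:
```python
import torch

def q_sample(x0: torch.Tensor, t: torch.Tensor, alphas_cumprod: torch.Tensor):
    """Forward diffusion: jump directly from clean x0 to noisy x_t."""
    a_bar = alphas_cumprod[t].view(-1, 1, 1, 1)
    eps = torch.randn_like(x0)
    xt = torch.sqrt(a_bar) * x0 + torch.sqrt(1.0 - a_bar) * eps
    return xt, eps          # eps is the regression target for the denoiser

betas = torch.linspace(1e-4, 0.02, 1000)
acp = torch.cumprod(1.0 - betas, dim=0)
xt, eps = q_sample(torch.randn(2, 1, 64, 64), torch.tensor([10, 500]), acp)
# training step (denoiser assumed): loss = ((denoiser(xt, t) - eps) ** 2).mean()
```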
Collapse
Affiliation(s)
- Linjie Fu
- Chengdu Computer Application Institute, Chinese Academy of Sciences, China; University of the Chinese Academy of Sciences, China.
| | - Xia Li
- Radiophysical Technology Center, Cancer Center, West China Hospital, Sichuan University, China.
| | - Xiuding Cai
- Chengdu Computer Application Institute, Chinese Academy of Sciences, China; University of the Chinese Academy of Sciences, China.
| | - Dong Miao
- Chengdu Computer Application Institute, Chinese Academy of Sciences, China; University of the Chinese Academy of Sciences, China.
| | - Yu Yao
- Chengdu Computer Application Institute, Chinese Academy of Sciences, China; University of the Chinese Academy of Sciences, China.
| | - Yali Shen
- Department of Abdominal Oncology, West China Hospital, Sichuan University, China.
| |
Collapse
|
46
|
Hussein R, Shin D, Zhao MY, Guo J, Davidzon G, Steinberg G, Moseley M, Zaharchuk G. Turning brain MRI into diagnostic PET: 15O-water PET CBF synthesis from multi-contrast MRI via attention-based encoder-decoder networks. Med Image Anal 2024; 93:103072. [PMID: 38176356 PMCID: PMC10922206 DOI: 10.1016/j.media.2023.103072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/15/2022] [Revised: 12/20/2023] [Accepted: 12/20/2023] [Indexed: 01/06/2024]
Abstract
Accurate quantification of cerebral blood flow (CBF) is essential for the diagnosis and assessment of a wide range of neurological diseases. Positron emission tomography (PET) with radiolabeled water (15O-water) is the gold standard for the measurement of CBF in humans; however, it is not widely available due to its prohibitive costs and its use of short-lived radiopharmaceutical tracers that require onsite cyclotron production. Magnetic resonance imaging (MRI), in contrast, is more accessible and does not involve ionizing radiation. This study presents a convolutional encoder-decoder network with attention mechanisms to predict gold-standard 15O-water PET CBF from multi-contrast MRI scans, thus eliminating the need for radioactive tracers. The model was trained and validated using 5-fold cross-validation in a group of 126 subjects consisting of healthy controls and cerebrovascular disease patients, all of whom underwent simultaneous 15O-water PET/MRI. The results demonstrate that the model can successfully synthesize high-quality PET CBF measurements (with an average SSIM of 0.924 and PSNR of 38.8 dB) and is more accurate than concurrent and previous PET synthesis methods. We also demonstrate the clinical significance of the proposed algorithm by evaluating agreement in identifying vascular territories with impaired CBF. Such methods may enable more widespread and accurate CBF evaluation in larger cohorts who cannot undergo PET imaging due to radiation concerns, lack of access, or logistic challenges.
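One widely used attention mechanism in encoder-decoder synthesis networks is an additive attention gate on the skip connections, where the decoder signal decides which encoder features to pass through. The sketch below shows this generic form; it is an illustration of the mechanism class, not the exact design of this paper.
```python
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    """Additive attention gate on a skip connection: decoder signal g gates
    encoder features x spatially before they are concatenated downstream."""
    def __init__(self, g_ch: int, x_ch: int, inter_ch: int):
        super().__init__()
        self.wg = nn.Conv2d(g_ch, inter_ch, 1)
        self.wx = nn.Conv2d(x_ch, inter_ch, 1)
        self.psi = nn.Conv2d(inter_ch, 1, 1)

    def forward(self, g: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        a = torch.sigmoid(self.psi(torch.relu(self.wg(g) + self.wx(x))))
        return x * a          # suppress spatial locations irrelevant to g

gate = AttentionGate(64, 32, 16)
out = gate(torch.randn(1, 64, 32, 32), torch.randn(1, 32, 32, 32))
```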
Collapse
Affiliation(s)
- Ramy Hussein
- Radiological Sciences Laboratory, Department of Radiology, Stanford University, Stanford, CA 94305, USA.
| | - David Shin
- Global MR Applications & Workflow, GE Healthcare, Menlo Park, CA 94025, USA
| | - Moss Y Zhao
- Radiological Sciences Laboratory, Department of Radiology, Stanford University, Stanford, CA 94305, USA; Stanford Cardiovascular Institute, Stanford University, Stanford, CA 94305, USA
| | - Jia Guo
- Department of Bioengineering, University of California, Riverside, CA 92521, USA
| | - Guido Davidzon
- Division of Nuclear Medicine, Department of Radiology, Stanford University, Stanford, CA 94305, USA
| | - Gary Steinberg
- Department of Neurosurgery, Stanford University, Stanford, CA 94304, USA
| | - Michael Moseley
- Radiological Sciences Laboratory, Department of Radiology, Stanford University, Stanford, CA 94305, USA
| | - Greg Zaharchuk
- Radiological Sciences Laboratory, Department of Radiology, Stanford University, Stanford, CA 94305, USA
| |
Collapse
|
47
|
Jiang M, Wang S, Song Z, Song L, Wang Y, Zhu C, Zheng Q. Cross2SynNet: cross-device-cross-modal synthesis of routine brain MRI sequences from CT with brain lesion. MAGMA (NEW YORK, N.Y.) 2024; 37:241-256. [PMID: 38315352 DOI: 10.1007/s10334-023-01145-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/08/2023] [Revised: 11/28/2023] [Accepted: 12/27/2023] [Indexed: 02/07/2024]
Abstract
OBJECTIVES CT and MRI are often needed together to determine the location and extent of brain lesions and improve diagnosis. However, patients with acute brain diseases often cannot complete an MRI examination within a short time. The aim of this study is to devise a cross-device, cross-modal medical image synthesis (MIS) method, Cross2SynNet, for synthesizing the routine brain MRI sequences T1WI, T2WI, FLAIR, and DWI from CT in patients with stroke and brain tumors. MATERIALS AND METHODS For this retrospective study, the participants covered four different diseases: cerebral ischemic stroke (CIS-cohort), cerebral hemorrhage (CH-cohort), meningioma (M-cohort), and glioma (G-cohort). The MIS model Cross2SynNet was established on the basic architecture of a conditional generative adversarial network (CGAN), in which a fully convolutional Transformer (FCT) module was adopted in the generator to capture the short- and long-range dependencies between healthy and pathological tissues, and an edge loss function was used to minimize the difference in gradient magnitude between the synthetic image and the ground truth. Three metrics, mean squared error (MSE), peak signal-to-noise ratio (PSNR), and the structural similarity index measure (SSIM), were used for evaluation. RESULTS A total of 230 participants (mean age, 59.77 years ± 13.63 [standard deviation]; 163 men [71%] and 67 women [29%]) were included: the CIS-cohort (95 participants, Dec 2019 to Feb 2022), CH-cohort (69 participants, Jan 2020 to Dec 2021), M-cohort (40 participants, Sep 2018 to Dec 2021), and G-cohort (26 participants, Sep 2019 to Dec 2021). Cross2SynNet achieved average values of MSE = 0.008, PSNR = 21.728, and SSIM = 0.758 when synthesizing MRI from CT, outperforming CycleGAN, pix2pix, RegGAN, Pix2PixHD, and ResViT. Cross2SynNet could synthesize brain lesions on pseudo-DWI even when the CT image did not exhibit a clear signal, as in acute ischemic stroke patients. CONCLUSIONS Cross2SynNet achieves routine brain MRI synthesis of T1WI, T2WI, FLAIR, and DWI from CT with promising performance for the brain lesions of stroke and brain tumor.
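The edge loss described here, an L1 difference between the gradient magnitudes of the synthetic image and the ground truth, can be sketched with finite differences. Whether Cross2SynNet uses finite differences or Sobel kernels is not stated, so treat the gradient operator below as an assumption.
```python
import torch

def edge_loss(syn: torch.Tensor, gt: torch.Tensor) -> torch.Tensor:
    """L1 difference between gradient magnitudes of two (B, C, H, W) images."""
    def grad_mag(x: torch.Tensor) -> torch.Tensor:
        dx = x[..., :, 1:] - x[..., :, :-1]     # horizontal forward difference
        dy = x[..., 1:, :] - x[..., :-1, :]     # vertical forward difference
        # crop to the common (H-1, W-1) grid before combining
        return torch.sqrt(dx[..., 1:, :] ** 2 + dy[..., :, 1:] ** 2 + 1e-8)
    return torch.mean(torch.abs(grad_mag(syn) - grad_mag(gt)))

loss = edge_loss(torch.rand(2, 1, 64, 64), torch.rand(2, 1, 64, 64))
```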
Collapse
Affiliation(s)
- Minbo Jiang
- School of Computer and Control Engineering, Yantai University, No 30, Qingquan Road, Laishan District, Yantai, 264005, Shandong, China
| | - Shuai Wang
- Department of Radiology, Binzhou Medical University Hospital, Binzhou, 256603, China
| | - Zhiwei Song
- School of Computer and Control Engineering, Yantai University, No 30, Qingquan Road, Laishan District, Yantai, 264005, Shandong, China
| | - Limei Song
- School of Medical Imaging, Weifang Medical University, Weifang, 261000, China
| | - Yi Wang
- School of Computer and Control Engineering, Yantai University, No 30, Qingquan Road, Laishan District, Yantai, 264005, Shandong, China
| | - Chuanzhen Zhu
- School of Computer and Control Engineering, Yantai University, No 30, Qingquan Road, Laishan District, Yantai, 264005, Shandong, China
| | - Qiang Zheng
- School of Computer and Control Engineering, Yantai University, No 30, Qingquan Road, Laishan District, Yantai, 264005, Shandong, China.
| |
Collapse
|
48
|
Bottani S, Thibeau-Sutre E, Maire A, Ströer S, Dormont D, Colliot O, Burgos N. Contrast-enhanced to non-contrast-enhanced image translation to exploit a clinical data warehouse of T1-weighted brain MRI. BMC Med Imaging 2024; 24:67. [PMID: 38504179 PMCID: PMC10953143 DOI: 10.1186/s12880-024-01242-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/22/2023] [Accepted: 03/07/2024] [Indexed: 03/21/2024] Open
Abstract
BACKGROUND Clinical data warehouses provide access to massive amounts of medical images, but these images are often heterogeneous. They can, for instance, include images acquired with or without the injection of a gadolinium-based contrast agent. Harmonizing such data sets is thus fundamental to guaranteeing unbiased results, for example when performing differential diagnosis. Furthermore, classical neuroimaging software tools for feature extraction are typically applied only to images without gadolinium. The objective of this work is to evaluate how image translation can help exploit a highly heterogeneous data set containing both contrast-enhanced and non-contrast-enhanced images from a clinical data warehouse. METHODS We propose and compare different 3D U-Net and conditional GAN models to convert contrast-enhanced T1-weighted (T1ce) into non-contrast-enhanced (T1nce) brain MRI. These models were trained using 230 image pairs and tested on 77 image pairs from the clinical data warehouse of the Greater Paris area. RESULTS Validation using standard image similarity measures demonstrated that the similarity between real and synthetic T1nce images was higher than between real T1nce and T1ce images for all the models compared. The best performing models were further validated on a segmentation task. We showed that tissue volumes extracted from synthetic T1nce images were closer to those of real T1nce images than volumes extracted from T1ce images. CONCLUSION We showed that deep learning models initially developed with research-quality data could synthesize T1nce from T1ce images of clinical quality and that reliable features could be extracted from the synthetic images, demonstrating the ability of such methods to help exploit a data set coming from a clinical data warehouse.
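Conditional GAN image translators of the kind compared here typically train the generator with a pix2pix-style composite objective: an adversarial term plus a weighted paired-reconstruction term. A generic sketch (the lambda value is a common default, not necessarily what this study used):
```python
import torch
import torch.nn.functional as F

def generator_loss(d_fake_logits: torch.Tensor, fake: torch.Tensor,
                   real: torch.Tensor, lam: float = 100.0) -> torch.Tensor:
    """Conditional-GAN generator objective: fool the discriminator while
    staying close to the paired target image."""
    adv = F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.ones_like(d_fake_logits))   # "label as real"
    rec = F.l1_loss(fake, real)                          # paired L1 fidelity
    return adv + lam * rec

loss = generator_loss(torch.randn(2, 1, 14, 14),
                      torch.rand(2, 1, 128, 128), torch.rand(2, 1, 128, 128))
```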
Collapse
Affiliation(s)
- Simona Bottani
- Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, CNRS, Inria, Inserm, AP-HP, Hôpital de la Pitié-Salpêtrière, Paris, 75013, France
| | - Elina Thibeau-Sutre
- Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, CNRS, Inria, Inserm, AP-HP, Hôpital de la Pitié-Salpêtrière, Paris, 75013, France
| | - Aurélien Maire
- Innovation & Données - Département des Services Numériques, AP-HP, Paris, 75013, France
| | - Sebastian Ströer
- Hôpital Pitié Salpêtrière, Department of Neuroradiology, AP-HP, Paris, 75012, France
| | - Didier Dormont
- Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, CNRS, Inria, Inserm, AP-HP, Hôpital de la Pitié-Salpêtrière, DMU DIAMENT, Paris, 75013, France
| | - Olivier Colliot
- Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, CNRS, Inria, Inserm, AP-HP, Hôpital de la Pitié-Salpêtrière, Paris, 75013, France
| | - Ninon Burgos
- Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, CNRS, Inria, Inserm, AP-HP, Hôpital de la Pitié-Salpêtrière, Paris, 75013, France.
| |
Collapse
|
49
|
Li B, Hu W, Feng CM, Li Y, Liu Z, Xu Y. Multi-Contrast Complementary Learning for Accelerated MR Imaging. IEEE J Biomed Health Inform 2024; 28:1436-1447. [PMID: 38157466 DOI: 10.1109/jbhi.2023.3348328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/03/2024]
Abstract
Thanks to its powerful ability to depict high-resolution anatomical information, magnetic resonance imaging (MRI) has become an essential non-invasive scanning technique in clinical practice. However, excessive acquisition times often degrade image quality, cause psychological discomfort among subjects, and hinder wider adoption. Besides reconstructing images from the undersampled protocol itself, multi-contrast MRI protocols offer promising solutions by leveraging additional morphological priors for the target modality. Nevertheless, previous multi-contrast techniques mainly adopt simple fusion mechanisms that inevitably discard valuable knowledge. In this work, we propose a novel multi-contrast complementary information aggregation network, MCCA, which aims to fully exploit the available complementary representations to reconstruct the undersampled modality. Specifically, a multi-scale feature fusion mechanism incorporates complementary, transferable knowledge into the target modality, and a hybrid convolution-transformer block extracts global and local context dependencies simultaneously, combining the strengths of CNNs with those of Transformers. Extensive experiments on different datasets, under different acceleration factors and undersampling patterns, demonstrate the superiority of the proposed method over existing MRI reconstruction methods.
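A hybrid convolution-transformer block generally runs a local convolutional branch and a global self-attention branch in parallel and merges their outputs. The sketch below is a generic realization of that pattern, not the exact MCCA design; head count and fusion-by-sum are assumptions.
```python
import torch
import torch.nn as nn

class HybridConvTransformer(nn.Module):
    """Parallel local (conv) and global (self-attention) branches, summed."""
    def __init__(self, ch: int, heads: int = 4):
        super().__init__()
        self.local = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.GELU())
        self.attn = nn.MultiheadAttention(ch, heads, batch_first=True)
        self.norm = nn.LayerNorm(ch)

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (B, C, H, W)
        b, c, h, w = x.shape
        tokens = self.norm(x.flatten(2).transpose(1, 2))  # (B, HW, C)
        g, _ = self.attn(tokens, tokens, tokens, need_weights=False)
        g = g.transpose(1, 2).view(b, c, h, w)            # back to feature map
        return self.local(x) + g                          # local + global context

y = HybridConvTransformer(32)(torch.randn(1, 32, 16, 16))
```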
Collapse
|
50
|
Dai F, Li Y, Zhu Y, Li B, Shi Q, Chen Y, Ta D. B-mode ultrasound to elastography synthesis using multiscale learning. ULTRASONICS 2024; 138:107268. [PMID: 38402836 DOI: 10.1016/j.ultras.2024.107268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/04/2023] [Revised: 02/03/2024] [Accepted: 02/09/2024] [Indexed: 02/27/2024]
Abstract
Elastography is a promising diagnostic tool that measures tissue stiffness, and it has been used clinically to assess lesions such as benign and malignant tumors. However, due to the high cost of examination and the limited availability of elastography-capable ultrasound devices, elastography is not widely used in primary medical facilities in rural areas. To address this issue, a deep learning approach called the multiscale elastic image synthesis network (MEIS-Net) is proposed, which uses multiscale learning to synthesize elastic images directly from B-mode ultrasound data, in place of traditional ultrasound elastography based on elastic deformation. The method integrates multi-scale features of the prostate in an innovative way and enhances the elastic synthesis effect through a fusion module. The module obtains B-mode ultrasound and elastography feature maps, which are used to generate local and global elastic ultrasound images through their correspondence. Finally, the two-channel images are synthesized into the output elastic images. To evaluate the approach, quantitative assessments and diagnostic tests were conducted, comparing the results of MEIS-Net with several deep learning-based methods. The experiments showed that MEIS-Net was effective in synthesizing elastic images from B-mode ultrasound data acquired from two different devices, with a structural similarity index of 0.74 ± 0.04. This outperformed methods such as Pix2Pix (0.69 ± 0.09), CycleGAN (0.11 ± 0.27), and StarGANv2 (0.02 ± 0.01). Furthermore, the diagnostic tests demonstrated that the classification performance of the synthetic elastic images was comparable to that of real elastic images, with only a 3% decrease in the area under the curve (AUC), indicating the clinical effectiveness of the proposed method.
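Multiscale feature fusion of the sort described here is often implemented by resizing feature maps from several encoder scales to a common resolution and merging them with a 1x1 convolution. The sketch below is a generic version of that idea; the channel sizes and fusion-at-the-finest-scale choice are illustrative assumptions, not MEIS-Net's verified design.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiscaleFusion(nn.Module):
    """Resize multi-scale feature maps to a common size, then fuse with 1x1 conv."""
    def __init__(self, chans=(32, 64, 128), out_ch: int = 64):
        super().__init__()
        self.fuse = nn.Conv2d(sum(chans), out_ch, kernel_size=1)

    def forward(self, feats):
        target = feats[0].shape[-2:]           # fuse at the finest scale
        up = [F.interpolate(f, size=target, mode="bilinear", align_corners=False)
              for f in feats]
        return self.fuse(torch.cat(up, dim=1))

feats = [torch.randn(1, c, s, s) for c, s in zip((32, 64, 128), (64, 32, 16))]
fused = MultiscaleFusion()(feats)              # (1, 64, 64, 64)
```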
Collapse
Affiliation(s)
- Fei Dai
- Center for Biomedical Engineering, School of Information Science and Technology, Fudan University, Shanghai 200433, China
| | - Yifang Li
- Academy for Engineering and Technology, Fudan University, Shanghai 200433, China
| | - Yunkai Zhu
- Department of Ultrasound, Xinhua Hospital Affiliated to Shanghai Jiaotong University School of Medicine, Shanghai 200092, China
| | - Boyi Li
- Academy for Engineering and Technology, Fudan University, Shanghai 200433, China
| | - Qinzhen Shi
- Center for Biomedical Engineering, School of Information Science and Technology, Fudan University, Shanghai 200433, China
| | - Yaqing Chen
- Department of Ultrasound, Xinhua Hospital Affiliated to Shanghai Jiaotong University School of Medicine, Shanghai 200092, China.
| | - Dean Ta
- Center for Biomedical Engineering, School of Information Science and Technology, Fudan University, Shanghai 200433, China.
| |
Collapse
|