1
Chen S, Zhang R, Liang H, Qian Y, Zhou X. Coupling of state space modules and attention mechanisms: An input-aware multi-contrast MRI synthesis method. Med Phys 2025; 52:2269-2278. [PMID: 39714363 DOI: 10.1002/mp.17598]
Abstract
BACKGROUND Medical imaging plays a pivotal role in the real-time monitoring of patients during the diagnostic and therapeutic processes. However, in clinical scenarios, the acquisition of multi-modal imaging protocols is often impeded by a number of factors, including time and economic costs, patients' willingness to cooperate, imaging quality, and even safety concerns. PURPOSE We proposed a learning-based medical image synthesis method to simplify the acquisition of multi-contrast MRI. METHODS We redesigned the basic structure of the Mamba block and explored different integration patterns between Mamba layers and Transformer layers to make it more suitable for medical image synthesis tasks. Experiments were conducted on the IXI (575 samples in total; training set: 450; validation set: 25; test set: 100) and BRATS (494 samples in total; training set: 350; validation set: 44; test set: 100) datasets to assess the synthesis performance of our proposed method in comparison to state-of-the-art models on the task of multi-contrast MRI synthesis. RESULTS Our proposed model outperformed other state-of-the-art models on several multi-contrast MRI synthesis tasks. In the synthesis task from T1 to PD, our method achieved a peak signal-to-noise ratio (PSNR) of 33.70 dB (95% CI, 33.61, 33.79) and a structural similarity index (SSIM) of 0.966 (95% CI, 0.964, 0.968). In the synthesis task from T2 to PD, the model achieved a PSNR of 33.90 dB (95% CI, 33.82, 33.98) and an SSIM of 0.971 (95% CI, 0.969, 0.973). In the synthesis task from FLAIR to T2, the model achieved a PSNR of 30.43 dB (95% CI, 30.29, 30.57) and an SSIM of 0.938 (95% CI, 0.935, 0.941). CONCLUSIONS Our proposed method could effectively model not only the high-dimensional, nonlinear mapping between the magnetic signals of hydrogen nuclei in tissues and the corresponding proton density signals, but also the recovery process of suppressed fluid signals in FLAIR. The proposed model employed distinct mechanisms when synthesizing images from normal and lesion samples, demonstrating that it had a profound comprehension of the input data. We also showed that, in a hierarchical network, only the deeper self-attention layers were responsible for directing more attention to lesion areas.
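For readers unfamiliar with the two figures of merit quoted above, the sketch below shows how PSNR and SSIM are commonly computed for a synthesized slice against its acquired reference, here with scikit-image; the array names and intensity range are placeholders, and the confidence intervals in the abstract would come from aggregating such per-slice values over the test set.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_slice(reference: np.ndarray, synthetic: np.ndarray):
    """Return (PSNR in dB, SSIM) for one 2D slice pair sharing the same intensity range."""
    data_range = float(reference.max() - reference.min())
    psnr = peak_signal_noise_ratio(reference, synthetic, data_range=data_range)
    ssim = structural_similarity(reference, synthetic, data_range=data_range)
    return psnr, ssim

# Placeholder stand-ins for a ground-truth PD slice and its synthesized counterpart.
ref = np.random.rand(256, 256).astype(np.float32)
syn = np.clip(ref + 0.02 * np.random.randn(256, 256).astype(np.float32), 0.0, 1.0)
print(evaluate_slice(ref, syn))
```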
Affiliation(s)
- Shuai Chen
  - Jiangsu Key Laboratory for Biomaterials and Devices, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Ruoyu Zhang
  - Jiangsu Key Laboratory for Biomaterials and Devices, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
- Huazheng Liang
  - Monash Suzhou Research Institute, Suzhou, Jiangsu Province, China
  - Shanghai Key Laboratory of Anesthesiology and Brain Functional Modulation, Clinical Research Center for Anesthesiology and Perioperative Medicine, Shanghai Fourth People's Hospital, School of Medicine, Translational Research Institute of Brain and Brain-Like Intelligence, Shanghai Fourth People's Hospital, School of Medicine, Tongji University, Shanghai, China
- Yunzhu Qian
  - Department of Stomatology, The Fourth Affiliated Hospital of Soochow University, Suzhou Dushu Lake Hospital, Medical Center of Soochow University, Suzhou, Jiangsu Province, China
- Xuefeng Zhou
  - Jiangsu Key Laboratory for Biomaterials and Devices, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
2
Tong MW, Zhou J, Akkaya Z, Majumdar S, Bhattacharjee R. Artificial intelligence in musculoskeletal applications: a primer for radiologists. Diagn Interv Radiol 2025; 31:89-101. [PMID: 39157958 PMCID: PMC11880867 DOI: 10.4274/dir.2024.242830]
Abstract
As an umbrella term, artificial intelligence (AI) covers machine learning and deep learning. This review aimed to elaborate on these terms to act as a primer for radiologists to learn more about the algorithms commonly used in musculoskeletal radiology. It also aimed to familiarize them with the common practices and issues in the use of AI in this domain.
Affiliation(s)
- Michelle W. Tong
  - University of California San Francisco Department of Radiology and Biomedical Imaging, San Francisco, USA
  - University of California San Francisco Department of Bioengineering, San Francisco, USA
  - University of California Berkeley Department of Bioengineering, Berkeley, USA
- Jiamin Zhou
  - University of California San Francisco Department of Orthopaedic Surgery, San Francisco, USA
- Zehra Akkaya
  - University of California San Francisco Department of Radiology and Biomedical Imaging, San Francisco, USA
  - Ankara University Faculty of Medicine Department of Radiology, Ankara, Türkiye
- Sharmila Majumdar
  - University of California San Francisco Department of Radiology and Biomedical Imaging, San Francisco, USA
  - University of California San Francisco Department of Bioengineering, San Francisco, USA
- Rupsa Bhattacharjee
  - University of California San Francisco Department of Radiology and Biomedical Imaging, San Francisco, USA
3
Dong Y, Wang P, Geng H, Liu Y, Wang E. Ultrasound and advanced imaging techniques in prostate cancer diagnosis: A comparative study of mpMRI, TRUS, and PET/CT. J Xray Sci Technol 2025; 33:436-447. [PMID: 39973788 DOI: 10.1177/08953996241304988]
Abstract
Objective: This study aims to assess and compare the diagnostic performance of three advanced imaging modalities, namely multiparametric magnetic resonance imaging (mpMRI), transrectal ultrasound (TRUS), and positron emission tomography/computed tomography (PET/CT), in detecting prostate cancer in patients with elevated PSA levels and abnormal DRE findings. Methods: A retrospective analysis was conducted on 150 male patients aged 50-75 years with elevated PSA and abnormal DRE. The diagnostic accuracy of each modality was assessed through sensitivity, specificity, and the area under the curve (AUC) to compare performance in detecting clinically significant prostate cancer (Gleason score ≥ 7). Results: mpMRI demonstrated the highest diagnostic performance, with a sensitivity of 90%, specificity of 85%, and AUC of 0.92, outperforming both TRUS (sensitivity 76%, specificity 78%, AUC 0.77) and PET/CT (sensitivity 82%, specificity 80%, AUC 0.81). mpMRI detected clinically significant tumors in 80% of cases. Although TRUS and PET/CT had similar detection rates for significant tumors, their overall accuracy was lower. Minor adverse events occurred in 5% of patients undergoing TRUS, while no significant complications were associated with mpMRI or PET/CT. Conclusion: These findings suggest that mpMRI is the most reliable imaging modality for early detection of clinically significant prostate cancer. It reduces the need for unnecessary biopsies and optimizes patient management.
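As a reminder of how the figures above are obtained, the following sketch derives sensitivity, specificity, and AUC from per-patient labels and suspicion scores with scikit-learn; the labels, scores, and cut-off are placeholders, not the study's data.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])                   # biopsy-confirmed csPCa (Gleason >= 7)
score  = np.array([0.9, 0.2, 0.7, 0.8, 0.4, 0.1, 0.6, 0.3])    # per-patient suspicion score (placeholder)

y_pred = (score >= 0.5).astype(int)                            # binarized call at a chosen cut-off
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
auc = roc_auc_score(y_true, score)
print(f"Sens {sensitivity:.2f}, Spec {specificity:.2f}, AUC {auc:.2f}")
```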
Affiliation(s)
- Ying Dong
  - Department of Radiology, Beijing Renhe Hospital, Beijing, China
- Peng Wang
  - Department of Imaging Diagnostic, Binzhou Hospital of Traditional Chinese Medicine, Binzhou City, China
- Hua Geng
  - Department of Oncology, Binzhou Hospital of Traditional Chinese Medicine, Binzhou City, China
- Yankun Liu
  - Department of Medical Imaging Center, Central Hospital Affiliated to Shandong First Medical University, Jinan City, China
- Enguo Wang
  - Department of Medical Imaging Center, Central Hospital Affiliated to Shandong First Medical University, Jinan City, China
4
D N S, Pai RM, Bhat SN, Pai M M M. Assessment of perceived realism in AI-generated synthetic spine fracture CT images. Technol Health Care 2025; 33:931-944. [PMID: 40105176 DOI: 10.1177/09287329241291368]
Abstract
Background: Deep learning-based decision support systems require synthetic images generated by adversarial networks, which in turn require clinical evaluation to ensure their quality. Objective: The study evaluates the perceived realism of high-dimension synthetic spine fracture CT images generated by Progressive Growing Generative Adversarial Networks (PGGANs). Method: The study used 2820 spine fracture CT images from 456 patients to train a PGGAN model. The model synthesized images up to 512 × 512 pixels, and the realism of the generated images was assessed using Visual Turing Tests and a Fracture Identification Test (FIT). Three spine surgeons evaluated the images, and the clinical evaluation results were statistically analysed. Result: Spine surgeons had an average prediction accuracy of nearly 50% during clinical evaluation, indicating difficulty in distinguishing between real and generated images. The accuracy varied across image dimensions, with synthetic images being more realistic, especially at 512 × 512 pixels. During the FIT, 13-15 of the 16 generated images of each fracture type were correctly identified, indicating that the 512 × 512 images are realistic and clearly depict fracture lines. Conclusion: The study reveals that AI-based PGGANs can generate realistic synthetic spine fracture CT images up to 512 × 512 pixels, making them difficult to distinguish from real images and supporting the development of automatic spine fracture type detection systems.
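The "near 50% accuracy" criterion used above can be formalized as a test against chance. The sketch below, an assumption rather than the authors' analysis, applies a two-sided binomial test (SciPy 1.7 or later) to one reader's Visual Turing Test responses with placeholder counts.

```python
from scipy.stats import binomtest

n_images = 100        # images shown to one surgeon (placeholder)
n_correct = 52        # correctly labelled as real or generated (placeholder)

result = binomtest(n_correct, n_images, p=0.5, alternative="two-sided")
print(f"accuracy = {n_correct / n_images:.2f}, p = {result.pvalue:.3f}")
# A large p-value is consistent with chance-level performance, i.e. the synthetic
# 512 x 512 images are not reliably distinguishable from real CT slices.
```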
Affiliation(s)
- Sindhura D N
  - Department of Data Science and Computer Applications, Manipal Institute of Technology, Manipal, Manipal Academy of Higher Education, Manipal, India
- Radhika M Pai
  - Department of Data Science and Computer Applications, Manipal Institute of Technology, Manipal, Manipal Academy of Higher Education, Manipal, India
- Shyamasunder N Bhat
  - Department of Orthopaedics, Kasturba Medical College, Manipal, Manipal Academy of Higher Education, Manipal, India
- Manohara Pai M M
  - Department of Information and Communication Technology, Manipal Institute of Technology, Manipal, Manipal Academy of Higher Education, Manipal, India
5
Chu L, Ma B, Dong X, He Y, Che T, Zeng D, Zhang Z, Li S. A paired dataset of multi-modal MRI at 3 Tesla and 7 Tesla with manual hippocampal subfield segmentations. Sci Data 2025; 12:260. [PMID: 39948093 PMCID: PMC11825668 DOI: 10.1038/s41597-025-04586-9]
Abstract
The hippocampus plays a critical role in memory and is prone to neurodegenerative diseases. Its complex structure and distinct subfields pose challenges for automatic segmentation in 3 T MRI because of its limited resolution and contrast. While 7 T MRI offers superior anatomical detail and better gray-white matter contrast, aiding in clearer differentiation of hippocampal structures, its use is restricted by high costs. To bridge this gap, algorithms synthesizing 7T-like images from 3 T scans are being developed, requiring paired datasets for training. However, the scarcity of such high-quality paired datasets, particularly those with manual hippocampal subfield segmentations as ground truth, hinders progress. Herein, we introduce a dataset comprising paired 3 T and 7 T MRI scans from 20 healthy volunteers, with manual hippocampal subfield annotations on 7 T T2-weighted images. This dataset is designed to support the development and evaluation of both 3T-to-7T MR image synthesis models and automated hippocampal segmentation algorithms on 3 T images. We assessed image quality using MRIQC. The dataset is freely accessible on Figshare+.
Affiliation(s)
- Lei Chu
  - Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science & Medical Engineering, Beihang University, Beijing, 100083, China
- Baoqiang Ma
  - Department of Radiation Oncology, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
- Xiaoxi Dong
  - State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, 100875, China
- Yirong He
  - State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, 100875, China
- Tongtong Che
  - State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, 100875, China
- Debin Zeng
  - Beijing Advanced Innovation Center for Biomedical Engineering, School of Biological Science & Medical Engineering, Beihang University, Beijing, 100083, China
- Zihao Zhang
  - State Key Laboratory of Brain and Cognitive Science, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
  - Anhui Province Key Laboratory of Biomedical Imaging and Intelligent Processing, Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, 230088, China
  - University of the Chinese Academy of Sciences, Beijing, 100049, China
- Shuyu Li
  - State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, 100875, China
6
McCoy JA, Levine LD, Wan G, Chivers C, Teel J, La Cava WG. Intrapartum electronic fetal heart rate monitoring to predict acidemia at birth with the use of deep learning. Am J Obstet Gynecol 2025; 232:116.e1-116.e9. [PMID: 38663662 PMCID: PMC11499302 DOI: 10.1016/j.ajog.2024.04.022]
Abstract
BACKGROUND Electronic fetal monitoring is used in most US hospital births but has significant limitations in achieving its intended goal of preventing intrapartum hypoxic-ischemic injury. Novel deep learning techniques can improve complex data processing and pattern recognition in medicine. OBJECTIVE This study aimed to apply deep learning approaches to develop and validate a model to predict fetal acidemia from electronic fetal monitoring data. STUDY DESIGN The database was created using intrapartum electronic fetal monitoring data from 2006 to 2020 from a large, multisite academic health system. Data were divided into training and testing sets with equal distribution of acidemic cases. Several different deep learning architectures were explored. The primary outcome was umbilical artery acidemia, which was investigated at 4 clinically meaningful thresholds: 7.20, 7.15, 7.10, and 7.05, along with base excess. The receiver operating characteristic curves were generated with the area under the receiver operating characteristic assessed to determine the performance of the models. External validation was performed using a publicly available Czech database of electronic fetal monitoring data. RESULTS A total of 124,777 electronic fetal monitoring files were available, of which 77,132 had <30% missingness in the last 60 minutes of the electronic fetal monitoring tracing. Of these, 21,041 were matched to a corresponding umbilical cord gas result, of which 10,182 were time-stamped within 30 minutes of the last electronic fetal monitoring reading and composed the final dataset. The prevalence rates of the outcomes in the data were 20.9% with a pH of <7.2, 9.1% with a pH of <7.15, 3.3% with a pH of <7.10, and 1.3% with a pH of <7.05. The best performing model achieved an area under the receiver operating characteristic of 0.85 at a pH threshold of <7.05. When predicting the joint outcome of both pH of <7.05 and base excess of less than -10 meq/L, an area under the receiver operating characteristic of 0.89 was achieved. When predicting both pH of <7.20 and base excess of less than -10 meq/L, an area under the receiver operating characteristic of 0.87 was achieved. At a pH of <7.15 and a positive predictive value of 30%, the model achieved a sensitivity of 90% and a specificity of 48%. CONCLUSION The application of deep learning methods to intrapartum electronic fetal monitoring analysis achieves promising performance in predicting fetal acidemia. This technology could help improve the accuracy and consistency of electronic fetal monitoring interpretation.
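The last result quoted above fixes an operating point by its positive predictive value. A minimal sketch of how such a threshold can be located from predicted probabilities is given below; the prevalence, scores, and target PPV are placeholders, not the study's data.

```python
import numpy as np

def operating_point_at_ppv(y_true, y_prob, target_ppv=0.30):
    """Return (threshold, sensitivity, specificity) at the lowest threshold whose PPV reaches the target."""
    y_true = np.asarray(y_true).astype(bool)
    y_prob = np.asarray(y_prob)
    for thr in np.sort(np.unique(y_prob)):
        pred = y_prob >= thr
        tp = np.sum(pred & y_true)
        fp = np.sum(pred & ~y_true)
        if tp + fp == 0:
            continue
        if tp / (tp + fp) >= target_ppv:
            sens = tp / max(y_true.sum(), 1)
            spec = np.sum(~pred & ~y_true) / max((~y_true).sum(), 1)
            return float(thr), float(sens), float(spec)
    return None

# Placeholder data at roughly the 9% prevalence reported for pH < 7.15.
rng = np.random.default_rng(0)
y_true = rng.binomial(1, 0.09, 2000)
y_prob = np.clip(0.15 * y_true + rng.random(2000) * 0.5, 0, 1)
print(operating_point_at_ppv(y_true, y_prob))
```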
Affiliation(s)
- Jennifer A McCoy
  - Maternal Fetal Medicine Research Program, Department of Obstetrics and Gynecology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
- Lisa D Levine
  - Maternal Fetal Medicine Research Program, Department of Obstetrics and Gynecology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
- Guangya Wan
  - School of Data Science, University of Virginia, Charlottesville, VA
- Joseph Teel
  - Department of Family Medicine and Community Health, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
- William G La Cava
  - Computational Health Informatics Program, Department of Pediatrics, Boston Children's Hospital, Harvard Medical School, Boston, MA
7
Suero Molina E, Tabassum M, Azemi G, Özdemir Z, Roll W, Backhaus P, Schindler P, Valls Chavarria A, Russo C, Liu S, Stummer W, Di Ieva A. Synthetic O-(2-18F-fluoroethyl)-l-tyrosine-positron emission tomography generation and hotspot prediction via preoperative MRI fusion of gliomas lacking radiographic high-grade characteristics. Neurooncol Adv 2025; 7:vdaf001. [PMID: 40264944 PMCID: PMC12012690 DOI: 10.1093/noajnl/vdaf001]
Abstract
Background The limited availability of amino acid positron emission tomography (PET) imaging hinders therapeutic decision-making for gliomas without typical high-grade imaging features. To address this gap, we evaluated a generative artificial intelligence (AI) approach for creating synthetic O-(2-18F-fluoroethyl)-l-tyrosine ([18F]FET)-PET and predicting high [18F]FET uptake from magnetic resonance imaging (MRI). Methods We trained a deep learning (DL)-based model to segment tumors in MRI, extracted radiomic features using the Python PyRadiomics package, and utilized a Random Forest classifier to predict high [18F]FET uptake. To generate [18F]FET-PET images, we employed a generative adversarial network framework and utilized a split-input fusion module for processing different MRI sequences through feature extraction, concatenation, and self-attention. Results We included MRI and PET images from 215 studies for the hotspot classification task and 211 studies for the synthetic PET generation task. The top-performing radiomic features achieved 80% accuracy for hotspot prediction. Of the synthetic [18F]FET-PET images, 85% were classified as clinically useful by senior physicians. Peak signal-to-noise ratio analysis indicated high signal fidelity with a peak at 40 dB, while structural similarity index values showed structural congruence. Root mean square error values were below 5.6. Most visual information fidelity scores ranged between 0.6 and 0.7. This indicates that synthetic PET images retain the essential information required for clinical assessment and diagnosis. Conclusion For the first time, we demonstrate that predicting high [18F]FET uptake and generating synthetic PET images from preoperative MRI in lower-grade and high-grade glioma are feasible. Advanced MRI modalities and other generative AI models will be used to improve the algorithm further in future studies.
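A rough sketch of the radiomics-plus-classifier part of the pipeline described above is shown below, assuming the generic PyRadiomics and scikit-learn interfaces; the file paths, labels, and extractor settings are placeholders rather than the authors' configuration.

```python
import pandas as pd
from radiomics import featureextractor           # PyRadiomics
from sklearn.ensemble import RandomForestClassifier

# Placeholder inputs: MR volumes, tumor masks from the DL segmentation step, and hotspot labels.
mri_paths  = ["sub01_t1.nii.gz", "sub02_t1.nii.gz"]
mask_paths = ["sub01_seg.nii.gz", "sub02_seg.nii.gz"]
y_hotspot  = [1, 0]                               # 1 = high [18F]FET uptake expected

extractor = featureextractor.RadiomicsFeatureExtractor()   # default feature classes and settings

rows = []
for image_path, mask_path in zip(mri_paths, mask_paths):
    result = extractor.execute(image_path, mask_path)
    rows.append({k: v for k, v in result.items() if not k.startswith("diagnostics_")})
X = pd.DataFrame(rows).astype(float)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y_hotspot)
print(clf.predict(X))                             # on real data, evaluate with cross-validation instead
```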
Affiliation(s)
- Eric Suero Molina
  - Macquarie Neurosurgery & Spine, Macquarie University Hospital, Sydney, NSW, Australia
  - Computational NeuroSurgery (CNS) Lab, Macquarie Medical School, Macquarie University, Sydney, NSW, Australia
  - Department of Neurosurgery, University Hospital Münster, Münster, Germany
- Mehnaz Tabassum
  - Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, NSW, Australia
  - Computational NeuroSurgery (CNS) Lab, Macquarie Medical School, Macquarie University, Sydney, NSW, Australia
- Ghasem Azemi
  - Computational NeuroSurgery (CNS) Lab, Macquarie Medical School, Macquarie University, Sydney, NSW, Australia
- Zeynep Özdemir
  - Department of Neurosurgery, University Hospital Münster, Münster, Germany
- Wolfgang Roll
  - Department of Nuclear Medicine, University Hospital Münster, Münster, Germany
- Philipp Backhaus
  - Department of Nuclear Medicine, University Hospital Münster, Münster, Germany
- Carlo Russo
  - Computational NeuroSurgery (CNS) Lab, Macquarie Medical School, Macquarie University, Sydney, NSW, Australia
- Sidong Liu
  - Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, NSW, Australia
  - Computational NeuroSurgery (CNS) Lab, Macquarie Medical School, Macquarie University, Sydney, NSW, Australia
- Walter Stummer
  - Department of Neurosurgery, University Hospital Münster, Münster, Germany
- Antonio Di Ieva
  - Department of Neurosurgery, Nepean Blue Mountains Local Health District, Kingswood, NSW, Australia
  - Macquarie Neurosurgery & Spine, Macquarie University Hospital, Sydney, NSW, Australia
  - Computational NeuroSurgery (CNS) Lab, Macquarie Medical School, Macquarie University, Sydney, NSW, Australia
8
Zhang Y, Peng C, Wang Q, Song D, Li K, Kevin Zhou S. Unified Multi-Modal Image Synthesis for Missing Modality Imputation. IEEE Trans Med Imaging 2025; 44:4-18. [PMID: 38976465 DOI: 10.1109/tmi.2024.3424785]
Abstract
Multi-modal medical images provide complementary soft-tissue characteristics that aid in the screening and diagnosis of diseases. However, limited scanning time, image corruption and various imaging protocols often result in incomplete multi-modal images, thus limiting the usage of multi-modal data for clinical purposes. To address this issue, in this paper, we propose a novel unified multi-modal image synthesis method for missing modality imputation. Our method overall takes a generative adversarial architecture, which aims to synthesize missing modalities from any combination of available ones with a single model. To this end, we specifically design a Commonality- and Discrepancy-Sensitive Encoder for the generator to exploit both modality-invariant and specific information contained in input modalities. The incorporation of both types of information facilitates the generation of images with consistent anatomy and realistic details of the desired distribution. Besides, we propose a Dynamic Feature Unification Module to integrate information from a varying number of available modalities, which enables the network to be robust to random missing modalities. The module performs both hard integration and soft integration, ensuring the effectiveness of feature combination while avoiding information loss. Verified on two public multi-modal magnetic resonance datasets, the proposed method is effective in handling various synthesis tasks and shows superior performance compared to previous methods.
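To make the idea of unifying a varying number of available modalities more concrete, here is an assumption-level PyTorch sketch that combines per-modality feature maps with both a soft (attention-weighted average) and a hard (element-wise maximum) operation; it illustrates the concept only and is not the paper's Dynamic Feature Unification Module.

```python
import torch
import torch.nn as nn

class VariableModalityFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)   # per-modality attention logits

    def forward(self, feats: list) -> torch.Tensor:
        x = torch.stack(feats, dim=1)                         # (B, M, C, H, W), M = modalities present
        logits = torch.stack([self.score(f) for f in feats], dim=1)  # (B, M, 1, H, W)
        weights = torch.softmax(logits, dim=1)
        soft = (weights * x).sum(dim=1)                       # "soft" integration: weighted average
        hard = x.max(dim=1).values                            # "hard" integration: element-wise maximum
        return soft + hard                                    # fused representation

fusion = VariableModalityFusion(channels=32)
available = [torch.randn(2, 32, 64, 64) for _ in range(3)]    # e.g., three of four MR contrasts present
print(fusion(available).shape)                                # torch.Size([2, 32, 64, 64])
```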
9
Luo Y, Yang Q, Fan Y, Qi H, Xia M. Measurement Guidance in Diffusion Models: Insight from Medical Image Synthesis. IEEE Trans Pattern Anal Mach Intell 2024; 46:7983-7997. [PMID: 38743550 DOI: 10.1109/tpami.2024.3399098]
Abstract
In the field of healthcare, the acquisition of samples is usually restricted by multiple considerations, including cost, labor-intensive annotation, privacy concerns, and radiation hazards; synthesizing images of interest is therefore an important tool for data augmentation. Diffusion models have recently attained state-of-the-art results in various synthesis tasks, and embedding energy functions has been shown to effectively guide a pre-trained model to synthesize target samples. However, we notice that current method development and validation are still limited to improving indicators such as the Fréchet Inception Distance (FID) and Inception Score (IS), and have not provided deeper investigations on downstream tasks such as disease grading and diagnosis. Moreover, existing classifier guidance, which can be regarded as a special case of an energy function, can only have a singular effect on altering the distribution of the synthetic dataset. This may contribute to in-distribution synthetic samples that are of limited help to downstream model optimization. All these limitations remind us that we still have a long way to go to achieve controllable generation. In this work, we first conducted an analysis of previous guidance and its contributions to further applications from the perspective of data distribution. To synthesize samples that can help downstream applications, we then introduce uncertainty guidance in each sampling step and design an uncertainty-guided diffusion model. Extensive experiments on four medical datasets, with ten classic networks trained on the augmented sample sets, provided a comprehensive evaluation of the practical contributions of our methodology. Furthermore, we provide a theoretical guarantee for general gradient guidance in diffusion models, which would benefit future research on investigating other forms of measurement guidance for specific generative tasks.
10
Zhou Y, Chen T, Hou J, Xie H, Dvornek NC, Zhou SK, Wilson DL, Duncan JS, Liu C, Zhou B. Cascaded Multi-path Shortcut Diffusion Model for Medical Image Translation. Med Image Anal 2024; 98:103300. [PMID: 39226710 PMCID: PMC11979896 DOI: 10.1016/j.media.2024.103300]
Abstract
Image-to-image translation is a vital component in medical imaging processing, with many uses in a wide range of imaging modalities and clinical scenarios. Previous methods include Generative Adversarial Networks (GANs) and Diffusion Models (DMs), which offer realism but suffer from instability and lack uncertainty estimation. Even though both GAN and DM methods have individually exhibited their capability in medical image translation tasks, the potential of combining a GAN and DM to further improve translation performance and to enable uncertainty estimation remains largely unexplored. In this work, we address these challenges by proposing a Cascade Multi-path Shortcut Diffusion Model (CMDM) for high-quality medical image translation and uncertainty estimation. To reduce the required number of iterations and ensure robust performance, our method first obtains a conditional GAN-generated prior image that will be used for the efficient reverse translation with a DM in the subsequent step. Additionally, a multi-path shortcut diffusion strategy is employed to refine translation results and estimate uncertainty. A cascaded pipeline further enhances translation quality, incorporating residual averaging between cascades. We collected three different medical image datasets with two sub-tasks for each dataset to test the generalizability of our approach. Our experimental results found that CMDM can produce high-quality translations comparable to state-of-the-art methods while providing reasonable uncertainty estimations that correlate well with the translation error.
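The multi-path idea above can be illustrated with a small sketch: run the stochastic reverse process several times from the same GAN prior, average the outputs, and read the per-voxel spread as an uncertainty map. The code below is a stand-in with placeholder arrays, not the authors' implementation.

```python
import numpy as np

def multi_path_translate(prior_image: np.ndarray, n_paths: int = 8, seed: int = 0):
    rng = np.random.default_rng(seed)
    paths = []
    for _ in range(n_paths):
        # Stand-in for one stochastic diffusion reverse pass conditioned on the GAN prior.
        paths.append(prior_image + 0.05 * rng.standard_normal(prior_image.shape))
    stack = np.stack(paths)                   # (n_paths, H, W)
    translation = stack.mean(axis=0)          # averaged output used as the final result
    uncertainty = stack.std(axis=0)           # voxel-wise spread across paths
    return translation, uncertainty

img = np.random.rand(128, 128).astype(np.float32)
out, unc = multi_path_translate(img)
print(out.shape, float(unc.mean()))
```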
Affiliation(s)
- Yinchi Zhou
  - Department of Biomedical Engineering, Yale University, New Haven, CT, USA
- Tianqi Chen
  - Department of Computer Science, University of California Irvine, Irvine, CA, USA
- Jun Hou
  - Department of Computer Science, University of California Irvine, Irvine, CA, USA
- Huidong Xie
  - Department of Biomedical Engineering, Yale University, New Haven, CT, USA
- Nicha C Dvornek
  - Department of Biomedical Engineering, Yale University, New Haven, CT, USA
  - Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, USA
- S Kevin Zhou
  - School of Biomedical Engineering & Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, China
- David L Wilson
  - Department of Biomedical Engineering, Case Western Reserve University, Cleveland, OH, USA
- James S Duncan
  - Department of Biomedical Engineering, Yale University, New Haven, CT, USA
  - Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, USA
  - Department of Electrical Engineering, Yale University, New Haven, CT, USA
- Chi Liu
  - Department of Biomedical Engineering, Yale University, New Haven, CT, USA
  - Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, CT, USA
- Bo Zhou
  - Department of Radiology, Northwestern University, Chicago, IL, USA
11
Koike Y, Takegawa H, Anetai Y, Nakamura S, Yoshida K, Yoshida A, Yui M, Hirota K, Ueda K, Tanigawa N. Cone-Beam CT to CT Image Translation Using a Transformer-Based Deep Learning Model for Prostate Cancer Adaptive Radiotherapy. J Imaging Inform Med 2024. [PMID: 39511015 DOI: 10.1007/s10278-024-01312-6]
Abstract
Cone-beam computed tomography (CBCT) is widely utilized in image-guided radiation therapy; however, its image quality is poor compared to planning CT (pCT), thus restricting its utility for adaptive radiotherapy (ART). Our objective was to enhance CBCT image quality utilizing a transformer-based deep learning model, SwinUNETR, which we compared with a conventional convolutional neural network (CNN) model, U-net. This retrospective study involved 260 patients undergoing prostate radiotherapy, with 245 patients used for training and 15 patients reserved as an independent hold-out test dataset. Employing a CycleGAN framework, we generated synthetic CT (sCT) images from CBCT images, employing SwinUNETR and U-net as generators. We evaluated sCT image quality and assessed its dosimetric impact for photon therapy through gamma analysis and dose-volume histogram (DVH) comparisons. The mean absolute error values for the CT numbers, calculated using all voxels within the patient's body contour and taking the pCT images as a reference, were 84.07, 73.49, and 64.69 Hounsfield units for CBCT, U-net, and SwinUNETR images, respectively. Gamma analysis revealed superior agreement between the dose on the pCT images and on the SwinUNETR-based sCT plans compared to those based on U-net. DVH parameters calculated on the SwinUNETR-based sCT deviated by < 1% from those in pCT plans. Our study showed that, compared to the U-net model, SwinUNETR could proficiently generate more precise sCT images from CBCT images, facilitating more accurate dose calculations. This study demonstrates the superiority of transformer-based models over conventional CNN-based approaches for CBCT-to-CT translation, contributing to the advancement of image synthesis techniques in ART.
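The mean absolute error figures above are computed over voxels inside the body contour, with the planning CT as reference. A small sketch of that masked HU-domain MAE, using placeholder arrays, is given below.

```python
import numpy as np

def masked_mae_hu(reference_ct: np.ndarray, image: np.ndarray, body_mask: np.ndarray) -> float:
    """MAE in Hounsfield units over voxels inside the patient's body contour."""
    return float(np.mean(np.abs(reference_ct[body_mask] - image[body_mask])))

# Placeholder volumes: planning CT, a synthetic CT, and a crude body mask.
rng = np.random.default_rng(0)
pct = rng.uniform(-1000, 1500, (64, 256, 256))
sct = pct + rng.normal(0, 100, pct.shape)
mask = np.zeros(pct.shape, dtype=bool)
mask[:, 64:192, 64:192] = True
print(f"MAE = {masked_mae_hu(pct, sct, mask):.1f} HU")
```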
Affiliation(s)
- Yuhei Koike
  - Department of Radiology, Kansai Medical University, 2-5-1 Shinmachi, Hirakata, Osaka, 573-1010, Japan
  - Division of Radiation Oncology, Kansai Medical University Hospital, 2-3-1 Shinmachi, Hirakata, Osaka, 573-1191, Japan
- Hideki Takegawa
  - Department of Radiology, Kansai Medical University, 2-5-1 Shinmachi, Hirakata, Osaka, 573-1010, Japan
  - Division of Radiation Oncology, Kansai Medical University Hospital, 2-3-1 Shinmachi, Hirakata, Osaka, 573-1191, Japan
- Yusuke Anetai
  - Department of Radiology, Kansai Medical University, 2-5-1 Shinmachi, Hirakata, Osaka, 573-1010, Japan
  - Division of Radiation Oncology, Kansai Medical University Hospital, 2-3-1 Shinmachi, Hirakata, Osaka, 573-1191, Japan
- Satoaki Nakamura
  - Department of Radiology, Kansai Medical University, 2-5-1 Shinmachi, Hirakata, Osaka, 573-1010, Japan
  - Division of Radiation Oncology, Kansai Medical University Hospital, 2-3-1 Shinmachi, Hirakata, Osaka, 573-1191, Japan
- Ken Yoshida
  - Department of Radiology, Kansai Medical University, 2-5-1 Shinmachi, Hirakata, Osaka, 573-1010, Japan
  - Division of Radiation Oncology, Kansai Medical University Medical Center, 10-15 Fumizono-Cho, Moriguchi, Osaka, 570-8507, Japan
- Asami Yoshida
  - Department of Radiology, Kansai Medical University, 2-5-1 Shinmachi, Hirakata, Osaka, 573-1010, Japan
  - Division of Radiation Oncology, Kansai Medical University Hospital, 2-3-1 Shinmachi, Hirakata, Osaka, 573-1191, Japan
- Midori Yui
  - Department of Radiology, Kansai Medical University, 2-5-1 Shinmachi, Hirakata, Osaka, 573-1010, Japan
  - Division of Radiation Oncology, Kansai Medical University Hospital, 2-3-1 Shinmachi, Hirakata, Osaka, 573-1191, Japan
- Kazuki Hirota
  - Department of Radiology, Kansai Medical University, 2-5-1 Shinmachi, Hirakata, Osaka, 573-1010, Japan
  - Division of Radiation Oncology, Kansai Medical University Hospital, 2-3-1 Shinmachi, Hirakata, Osaka, 573-1191, Japan
- Kenichi Ueda
  - Department of Radiology, Kansai Medical University, 2-5-1 Shinmachi, Hirakata, Osaka, 573-1010, Japan
  - Division of Radiation Oncology, Kansai Medical University Hospital, 2-3-1 Shinmachi, Hirakata, Osaka, 573-1191, Japan
- Noboru Tanigawa
  - Department of Radiology, Kansai Medical University, 2-5-1 Shinmachi, Hirakata, Osaka, 573-1010, Japan
12
Peng H, Liu T, Li P, Yang F, Luo X, Sun X, Gao D, Lin F, Jia L, Xu N, Tan H, Wang X, Ren T. Automatic delineation of cervical cancer target volumes in small samples based on multi-decoder and semi-supervised learning and clinical application. Sci Rep 2024; 14:26937. [PMID: 39505991 PMCID: PMC11542092 DOI: 10.1038/s41598-024-78424-0]
Abstract
Radiotherapy has been demonstrated to be one of the most significant treatments for cervical cancer, during which accurate and efficient delineation of target volumes is critical. To alleviate the data demand of deep learning and promote the establishment and adoption of auto-segmentation models in small and medium-sized oncology departments and single centres, we proposed an auto-segmentation algorithm to determine the cervical cancer target volume in small samples based on multi-decoder and semi-supervised learning (MDSSL), and we evaluated its accuracy via an independent test cohort. In this study, we retrospectively collected computed tomography (CT) datasets from 71 pelvic cervical cancer patients, and a 3:4 ratio was used for the training and testing sets. The clinical target volumes (CTVs) of the primary tumour area (CTV1) and pelvic lymph drainage area (CTV2) were delineated. For definitive radiotherapy (dRT), the primary gross target volume (GTVp) was simultaneously delineated. According to the data characteristics of small samples, the MDSSL network structure based on 3D U-Net was established to train the model by combining clinical anatomical information, and it was compared with other segmentation methods, including supervised learning (SL) and transfer learning (TL). The dice similarity coefficient (DSC), 95% Hausdorff distance (HD95) and average surface distance (ASD) were used to evaluate the segmentation performance. The ability of the segmentation algorithm to improve the efficiency of online adaptive radiation therapy (ART) was assessed via geometric indicators and a subjective evaluation by radiation oncologists (ROs) in prospective clinical applications. Compared with the SL model and the TL model, the proposed MDSSL model displayed the best DSC, HD95 and ASD overall, especially for the GTVp of dRT. We calculated the above geometric indicators within the range of the ground truth (head-foot direction). In the test set, the DSC, HD95 and ASD of the MDSSL model were 0.80/5.85 mm/0.95 mm for CTV1 of post-operative radiotherapy (pRT), 0.84/4.88 mm/0.73 mm for CTV2 of pRT, 0.84/6.58 mm/0.89 mm for GTVp of dRT, 0.85/5.36 mm/1.35 mm for CTV1 of dRT, and 0.84/4.09 mm/0.73 mm for CTV2 of dRT, respectively. In a prospective clinical study of online ART, the target volume modification time (MTime) was 3-5 min for dRT and 2-4 min for pRT, and the main duration of CTV1 modification was approximately 2 min. The introduction of the MDSSL method successfully improved the accuracy of auto-segmentation for the cervical cancer target volume in small samples, showed good consistency with RO delineation and satisfied clinical requirements. In this prospective online ART study, the application of the segmentation model was demonstrated to be useful for reducing the target volume delineation time and improving the efficiency of the online ART workflow, which can contribute to the development and promotion of cervical cancer online ART.
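For reference, the sketch below computes the DSC and HD95 metrics used above for a pair of binary masks, taking HD95 from surface-to-surface distances obtained with Euclidean distance transforms; the masks and voxel spacing are placeholders, and this is not the authors' evaluation code.

```python
import numpy as np
from scipy import ndimage

def dice(a: np.ndarray, b: np.ndarray) -> float:
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())

def hd95(a: np.ndarray, b: np.ndarray, spacing=(1.0, 1.0, 1.0)) -> float:
    surf_a = a ^ ndimage.binary_erosion(a)             # surface voxels of each mask
    surf_b = b ^ ndimage.binary_erosion(b)
    dt_a = ndimage.distance_transform_edt(~surf_a, sampling=spacing)  # distance to surface of a
    dt_b = ndimage.distance_transform_edt(~surf_b, sampling=spacing)  # distance to surface of b
    d = np.concatenate([dt_b[surf_a], dt_a[surf_b]])   # symmetric surface distances (mm)
    return float(np.percentile(d, 95))

gt   = np.zeros((32, 64, 64), bool); gt[10:20, 20:40, 20:40] = True   # placeholder reference CTV
pred = np.zeros_like(gt);            pred[11:21, 22:42, 21:41] = True # placeholder prediction
print(f"DSC = {dice(pred, gt):.2f}, HD95 = {hd95(pred, gt):.1f} mm")
```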
Affiliation(s)
- Haibo Peng
  - Oncology Department, Clinical Medical College, The First Affiliated Hospital of Chengdu Medical College, Chengdu, 610500, China
  - Clinical Key Speciality (Oncology Department) of Sichuan Province, Chengdu, 610500, China
- Tao Liu
  - Oncology Department, Clinical Medical College, The First Affiliated Hospital of Chengdu Medical College, Chengdu, 610500, China
  - Radiology and Therapy Clinical Medical Research Center of Sichuan Province, Chengdu, 610500, China
- Pengcheng Li
  - Oncology Department, Clinical Medical College, The First Affiliated Hospital of Chengdu Medical College, Chengdu, 610500, China
- Fang Yang
  - Oncology Department, Clinical Medical College, The First Affiliated Hospital of Chengdu Medical College, Chengdu, 610500, China
- Xing Luo
  - Oncology Department, Clinical Medical College, The First Affiliated Hospital of Chengdu Medical College, Chengdu, 610500, China
- Xiaoqing Sun
  - Radiotherapy Laboratory, Shenzhen United Imaging Research Institute of Innovative Medical Equipment, Shenzhen, 518048, China
- Dong Gao
  - United Imaging Central Research Institute Co., Ltd, Shanghai, 201807, China
- Fengyu Lin
  - Radiotherapy Laboratory, Shenzhen United Imaging Research Institute of Innovative Medical Equipment, Shenzhen, 518048, China
- Lecheng Jia
  - Radiotherapy Laboratory, Shenzhen United Imaging Research Institute of Innovative Medical Equipment, Shenzhen, 518048, China
  - Zhejiang Engineering Research Center for Innovation and Application of Intelligent Radiotherapy Technology, Wenzhou, 325000, China
- Ningyue Xu
  - Oncology Department, Clinical Medical College, The First Affiliated Hospital of Chengdu Medical College, Chengdu, 610500, China
- Huigang Tan
  - Oncology Department, Clinical Medical College, The First Affiliated Hospital of Chengdu Medical College, Chengdu, 610500, China
- Xi Wang
  - Department of Ultrasound, The General Hospital of Western Theater Command, Chengdu, 610083, China
- Tao Ren
  - Oncology Department, Clinical Medical College, The First Affiliated Hospital of Chengdu Medical College, Chengdu, 610500, China
  - Oncology Department, The First Affiliated Hospital of Traditional Chinese Medical of Chengdu Medical College, Xindu Hospital of Traditional Chinese Medical, Chengdu, 610500, China
13
Zhang R, Du X, Li H. Application and performance enhancement of FAIMS spectral data for deep learning analysis using generative adversarial network reinforcement. Anal Biochem 2024; 694:115627. [PMID: 39033946 DOI: 10.1016/j.ab.2024.115627]
Abstract
When using high-field asymmetric waveform ion mobility spectrometry (FAIMS) to analyse complex mixtures for deep learning, recognition performance suffers from the lack of high-quality data and low sample diversity. In this paper, a Generative Adversarial Network (GAN) method is introduced to simulate and generate highly realistic and diverse spectra, expanding a dataset of real mixture spectra from 15 classes collected by FAIMS. The mixed datasets were fed into VGG and ResNeXt for testing, and the experiments showed that the best recognition performance was achieved when the ratio of real to generated data was 1:4: accuracy improved by 24.19% and 6.43%, precision by 23.71% and 6.97%, recall by 21.08% and 7.09%, and F1-score by 24.50% and 8.23% for the two networks, respectively. These results demonstrate that a GAN can effectively expand the data volume and increase sample diversity without additional experimental cost, significantly enhancing deep learning analysis of FAIMS spectra of complex mixtures.
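The 1:4 real-to-generated ratio reported as optimal above amounts to a simple dataset-assembly step before training the classifiers. A minimal sketch with placeholder spectra and labels follows.

```python
import numpy as np

def mix_real_and_synthetic(x_real, y_real, x_gen, y_gen, ratio=4, seed=0):
    """Keep all real spectra and add up to `ratio` times as many GAN-generated ones."""
    rng = np.random.default_rng(seed)
    n_gen = min(len(x_gen), ratio * len(x_real))
    idx = rng.choice(len(x_gen), size=n_gen, replace=False)
    x = np.concatenate([x_real, x_gen[idx]])
    y = np.concatenate([y_real, y_gen[idx]])
    perm = rng.permutation(len(x))
    return x[perm], y[perm]

x_real = np.random.rand(150, 64, 64); y_real = np.random.randint(0, 15, 150)   # 15 mixture classes
x_gen  = np.random.rand(900, 64, 64); y_gen  = np.random.randint(0, 15, 900)   # GAN-generated spectra
x_train, y_train = mix_real_and_synthetic(x_real, y_real, x_gen, y_gen, ratio=4)
print(x_train.shape)   # (750, 64, 64): 150 real + 600 generated
```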
Affiliation(s)
- Ruilong Zhang
  - School of Life and Environmental Sciences, GuiLin University of Electronic Technology, GuiLin, 541004, China
- Xiaoxia Du
  - School of Life and Environmental Sciences, GuiLin University of Electronic Technology, GuiLin, 541004, China
- Hua Li
  - School of Life and Environmental Sciences, GuiLin University of Electronic Technology, GuiLin, 541004, China
14
Bahloul MA, Jabeen S, Benoumhani S, Alsaleh HA, Belkhatir Z, Al-Wabil A. Advancements in synthetic CT generation from MRI: A review of techniques, and trends in radiation therapy planning. J Appl Clin Med Phys 2024; 25:e14499. [PMID: 39325781 PMCID: PMC11539972 DOI: 10.1002/acm2.14499]
Abstract
BACKGROUND Magnetic resonance imaging (MRI) and computed tomography (CT) are crucial imaging techniques in both diagnostic imaging and radiation therapy. MRI provides excellent soft tissue contrast but lacks the direct electron density data needed for dose calculation. CT, on the other hand, remains the gold standard in radiation therapy planning (RTP) because of its accurate electron density information, but it exposes patients to ionizing radiation. Synthetic CT (sCT) generation from MRI has been a focus of study in recent years, both for its cost-effectiveness and for the goal of minimizing the side effects of using more than one imaging modality for treatment simulation. It offers significant time and cost efficiencies, bypasses the complexities of co-registration, and can potentially improve treatment accuracy by minimizing registration-related errors. In an effort to navigate the quickly developing field of precision medicine, this paper investigates recent advancements in sCT generation techniques, particularly those using machine learning (ML) and deep learning (DL). The review highlights the potential of these techniques to improve the efficiency and accuracy of sCT generation for use in RTP, thereby improving patient care and reducing healthcare costs. The clinical implications and technical underpinnings of these techniques are examined critically. PURPOSE This review aims to provide an overview of the most recent advancements in sCT generation from MRI with a particular focus on its use within RTP, emphasizing techniques, performance evaluation, clinical applications, future research trends, and open challenges in the field. METHODS A thorough search strategy was employed to conduct a systematic literature review across major scientific databases. Focusing on the past decade's advancements, this review critically examines approaches introduced from 2013 to 2023 for generating sCT from MRI and provides a comprehensive analysis of their methodologies, ultimately fostering further advancement in the field. The study highlights significant contributions, identifies challenges, and provides an overview of successes within RTP. Classifying the identified approaches, contrasting their advantages and disadvantages, and identifying broad trends were all part of the review's synthesis process. RESULTS The review identifies various sCT generation approaches, consisting of atlas-based, segmentation-based, multi-modal fusion, hybrid, and ML/DL-based techniques. These approaches are evaluated for image quality, dosimetric accuracy, and clinical acceptability. They are used for MRI-only radiation treatment, adaptive radiotherapy, and MR/PET attenuation correction. The review also highlights the diversity of methodologies for sCT generation, each with its own advantages and limitations. Emerging trends include the integration of advanced imaging inputs, such as Dixon, T1-weighted (T1W), and T2-weighted (T2W) sequences, as well as hybrid approaches, for enhanced accuracy. CONCLUSIONS The study examines MRI-based sCT generation with the aim of minimizing the negative effects of acquiring both modalities. It reviews 2013-2023 work on MRI-to-sCT generation methods, which aim to transform RTP by reducing the use of ionizing radiation and improving patient outcomes. The review provides insights for researchers and practitioners, emphasizing the need for standardized validation procedures and collaborative efforts to refine methods and address limitations. It anticipates the continued evolution of techniques to improve the precision of sCT in RTP.
Affiliation(s)
- Mohamed A. Bahloul
  - College of Engineering, Alfaisal University, Riyadh, Saudi Arabia
  - Translational Biomedical Engineering Research Lab, College of Engineering, Alfaisal University, Riyadh, Saudi Arabia
- Saima Jabeen
  - College of Engineering, Alfaisal University, Riyadh, Saudi Arabia
  - Translational Biomedical Engineering Research Lab, College of Engineering, Alfaisal University, Riyadh, Saudi Arabia
  - AI Research Center, College of Engineering, Alfaisal University, Riyadh, Saudi Arabia
- Sara Benoumhani
  - College of Engineering, Alfaisal University, Riyadh, Saudi Arabia
  - AI Research Center, College of Engineering, Alfaisal University, Riyadh, Saudi Arabia
- Zehor Belkhatir
  - School of Electronics and Computer Science, University of Southampton, Southampton, UK
- Areej Al-Wabil
  - College of Engineering, Alfaisal University, Riyadh, Saudi Arabia
  - AI Research Center, College of Engineering, Alfaisal University, Riyadh, Saudi Arabia
15
Chen X, Qiu RL, Peng J, Shelton JW, Chang CW, Yang X, Kesarwala AH. CBCT-based synthetic CT image generation using a diffusion model for CBCT-guided lung radiotherapy. Med Phys 2024; 51:8168-8178. [PMID: 39088750 PMCID: PMC11651384 DOI: 10.1002/mp.17328]
Abstract
BACKGROUND Although cone beam computed tomography (CBCT) has lower resolution compared to planning CTs (pCT), its lower dose, higher high-contrast resolution, and shorter scanning time support its widespread use in clinical applications, especially in ensuring accurate patient positioning during the image-guided radiation therapy (IGRT) process. PURPOSE While CBCT is critical to IGRT, CBCT image quality can be compromised by severe stripe and scattering artifacts. Tumor movement secondary to respiratory motion also decreases CBCT resolution. In order to improve the image quality of CBCT, we propose a Lung Diffusion Model (L-DM) framework. METHODS Our proposed algorithm is based on a conditional diffusion model trained on pCT and deformed CBCT (dCBCT) image pairs to synthesize lung CT images from dCBCT images and benefit CBCT-based radiotherapy. dCBCT images were used as the constraint for the L-DM. The image quality and Hounsfield unit (HU) values of the synthetic CTs (sCT) images generated by the proposed L-DM were compared to three selected mainstream generation models. RESULTS We verified our model in both an institutional lung cancer dataset and a selected public dataset. Our L-DM showed significant improvement in the four metrics of mean absolute error (MAE), peak signal-to-noise ratio (PSNR), normalized cross-correlation (NCC), and structural similarity index measure (SSIM). In our institutional dataset, our proposed L-DM decreased the MAE from 101.47 to 37.87 HU and increased the PSNR from 24.97 to 29.89 dB, the NCC from 0.81 to 0.97, and the SSIM from 0.80 to 0.93. In the public dataset, our proposed L-DM decreased the MAE from 173.65 to 58.95 HU, while increasing the PSNR, NCC, and SSIM from 13.07 to 24.05 dB, 0.68 to 0.94, and 0.41 to 0.88, respectively. CONCLUSIONS The proposed L-DM significantly improved sCT image quality compared to the pre-correction CBCT and three mainstream generative models. Our model can benefit CBCT-based IGRT and other potential clinical applications as it increases the HU accuracy and decreases the artifacts from input CBCT images.
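Alongside MAE, PSNR, and SSIM, the abstract reports normalized cross-correlation (NCC). A short sketch of NCC between a synthetic CT and the planning CT, with placeholder volumes, is shown below.

```python
import numpy as np

def ncc(a: np.ndarray, b: np.ndarray) -> float:
    """Normalized cross-correlation: 1.0 for identical images, ~0 for unrelated ones."""
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return float(np.mean(a * b))

pct = np.random.uniform(-1000, 1500, (64, 256, 256))        # planning CT (HU), placeholder
sct = 0.95 * pct + np.random.normal(0, 80, pct.shape)        # synthetic CT, placeholder
print(f"NCC = {ncc(sct, pct):.3f}")
```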
Affiliation(s)
- Xiaoqian Chen
  - Department of Radiation Oncology, Winship Cancer Institute, Emory University School of Medicine, Atlanta, GA, 30332, USA
- Richard L.J. Qiu
  - Department of Radiation Oncology, Winship Cancer Institute, Emory University School of Medicine, Atlanta, GA, 30332, USA
- Junbo Peng
  - Department of Radiation Oncology, Winship Cancer Institute, Emory University School of Medicine, Atlanta, GA, 30332, USA
- Joseph W. Shelton
  - Department of Radiation Oncology, Winship Cancer Institute, Emory University School of Medicine, Atlanta, GA, 30332, USA
- Chih-Wei Chang
  - Department of Radiation Oncology, Winship Cancer Institute, Emory University School of Medicine, Atlanta, GA, 30332, USA
- Xiaofeng Yang
  - Department of Radiation Oncology, Winship Cancer Institute, Emory University School of Medicine, Atlanta, GA, 30332, USA
- Aparna H. Kesarwala
  - Department of Radiation Oncology, Winship Cancer Institute, Emory University School of Medicine, Atlanta, GA, 30332, USA
16
Li X, Bellotti R, Bachtiary B, Hrbacek J, Weber DC, Lomax AJ, Buhmann JM, Zhang Y. A unified generation-registration framework for improved MR-based CT synthesis in proton therapy. Med Phys 2024; 51:8302-8316. [PMID: 39137294 DOI: 10.1002/mp.17338]
Abstract
BACKGROUND The use of magnetic resonance (MR) imaging for proton therapy treatment planning is gaining attention as a highly effective method for guidance. At the core of this approach is the generation of computed tomography (CT) images from MR scans. However, the critical issue in this process is accurately aligning the MR and CT images, a task that becomes particularly challenging in frequently moving body areas, such as the head-and-neck. Misalignments in these images can result in blurred synthetic CT (sCT) images, adversely affecting the precision and effectiveness of the treatment planning. PURPOSE This study introduces a novel network that cohesively unifies image generation and registration processes to enhance the quality and anatomical fidelity of sCTs derived from better-aligned MR images. METHODS The approach synergizes a generation network (G) with a deformable registration network (R), optimizing them jointly in MR-to-CT synthesis. This goal is achieved by alternately minimizing the discrepancies between the generated/registered CT images and their corresponding reference CT counterparts. The generation network employs a UNet architecture, while the registration network leverages an implicit neural representation (INR) of the displacement vector fields (DVFs). We validated this method on a dataset comprising 60 head-and-neck patients, reserving 12 cases for holdout testing. RESULTS Compared to the baseline Pix2Pix method with an MAE of 124.95 ± 30.74 HU, the proposed technique achieved an MAE of 80.98 ± 7.55 HU. The unified translation-registration network produced sharper and more anatomically congruent outputs, showing superior efficacy in converting MR images to sCTs. Additionally, from a dosimetric perspective, the plans recalculated on the resulting sCTs showed a markedly reduced discrepancy relative to the reference proton plans. CONCLUSIONS This study conclusively demonstrates that a holistic MR-based CT synthesis approach, integrating both image-to-image translation and deformable registration, significantly improves the precision and quality of sCT generation, particularly for challenging body areas with varied anatomic changes between corresponding MR and CT.
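The alternating scheme described above, where a generator and a deformable registration network are optimized in turn against each other's current output, can be caricatured as follows. The PyTorch sketch below uses toy convolutional stand-ins and a simple grid-sample warp; it illustrates the alternation only and is not the paper's UNet/INR implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

G = nn.Conv2d(1, 1, 3, padding=1)          # stand-in generator (MR -> sCT)
R = nn.Conv2d(2, 2, 3, padding=1)          # stand-in registration net predicting a 2-channel DVF
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_r = torch.optim.Adam(R.parameters(), lr=1e-3)

def warp(img, dvf):
    # Apply the displacement field with grid_sample (normalized coordinates).
    b, _, h, w = img.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    grid = torch.stack([xs, ys], dim=-1).unsqueeze(0).repeat(b, 1, 1, 1)
    return F.grid_sample(img, grid + dvf.permute(0, 2, 3, 1), align_corners=True)

mr, ct = torch.rand(4, 1, 64, 64), torch.rand(4, 1, 64, 64)   # loosely aligned pair (placeholder)
for step in range(100):
    # (1) registration step: align the reference CT to the current synthetic CT
    dvf = R(torch.cat([ct, G(mr).detach()], dim=1))
    loss_r = F.l1_loss(warp(ct, dvf), G(mr).detach())
    opt_r.zero_grad(); loss_r.backward(); opt_r.step()
    # (2) generation step: match the sCT to the freshly registered reference CT
    target = warp(ct, R(torch.cat([ct, G(mr)], dim=1))).detach()
    loss_g = F.l1_loss(G(mr), target)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```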
Affiliation(s)
- Xia Li
  - Center for Proton Therapy, Paul Scherrer Institut, Villigen PSI, Switzerland
  - Department of Computer Science, ETH Zürich, Zürich, Switzerland
- Renato Bellotti
  - Center for Proton Therapy, Paul Scherrer Institut, Villigen PSI, Switzerland
  - Department of Physics, ETH Zürich, Zürich, Switzerland
- Barbara Bachtiary
  - Center for Proton Therapy, Paul Scherrer Institut, Villigen PSI, Switzerland
- Jan Hrbacek
  - Center for Proton Therapy, Paul Scherrer Institut, Villigen PSI, Switzerland
- Damien C Weber
  - Center for Proton Therapy, Paul Scherrer Institut, Villigen PSI, Switzerland
  - Department of Radiation Oncology, University Hospital of Zürich, Zürich, Switzerland
  - Department of Radiation Oncology, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
- Antony J Lomax
  - Center for Proton Therapy, Paul Scherrer Institut, Villigen PSI, Switzerland
  - Department of Physics, ETH Zürich, Zürich, Switzerland
- Ye Zhang
  - Center for Proton Therapy, Paul Scherrer Institut, Villigen PSI, Switzerland
17
Diniz E, Santini T, Helmet K, Aizenstein HJ, Ibrahim TS. Cross-modality image translation of 3 Tesla Magnetic Resonance Imaging to 7 Tesla using Generative Adversarial Networks. medRxiv 2024:2024.10.16.24315609. [PMID: 39484249 PMCID: PMC11527090 DOI: 10.1101/2024.10.16.24315609]
Abstract
The rapid advancements in magnetic resonance imaging (MRI) technology have precipitated a new paradigm wherein cross-modality data translation across diverse imaging platforms, field strengths, and different sites is increasingly challenging. This issue is particularly accentuated when transitioning from 3 Tesla (3T) to 7 Tesla (7T) MRI systems. This study proposes a novel solution to these challenges using generative adversarial networks (GANs), specifically the CycleGAN architecture, to create synthetic 7T images from 3T data. Employing a dataset of 1112 and 490 unpaired 3T and 7T MR images, respectively, we trained a 2-dimensional (2D) CycleGAN model, evaluating its performance on a paired dataset of 22 participants scanned at 3T and 7T. Independent testing on 22 distinct participants affirmed the model's proficiency in accurately predicting various tissue types, encompassing cerebral spinal fluid, gray matter, and white matter. Our approach provides a reliable and efficient methodology for synthesizing 7T images, achieving a median Dice of 6.82%, 7.63%, and 4.85% for Cerebral Spinal Fluid (CSF), Gray Matter (GM), and White Matter (WM), respectively, in the testing dataset, thereby significantly aiding in harmonizing heterogeneous datasets. Furthermore, it delineates the potential of GANs in amplifying the contrast-to-noise ratio (CNR) from 3T, potentially enhancing the diagnostic capability of the images. While acknowledging the risk of model overfitting, our research underscores a promising progression towards harnessing the benefits of 7T MR systems in research investigations while preserving compatibility with existent 3T MR data. This work was previously presented at the ISMRM 2021 conference (Diniz, Helmet, Santini, Aizenstein, & Ibrahim, 2021).
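The contrast-to-noise ratio (CNR) mentioned above has a simple definition: the absolute difference between two tissue means divided by the noise standard deviation. A small sketch with placeholder regions follows.

```python
import numpy as np

def cnr(image: np.ndarray, roi_a: np.ndarray, roi_b: np.ndarray, background: np.ndarray) -> float:
    """CNR = |mean(A) - mean(B)| / std(background noise)."""
    return float(abs(image[roi_a].mean() - image[roi_b].mean()) / image[background].std())

img = np.random.normal(100, 5, (128, 128))
img[40:60, 40:60] += 30                                      # brighter "GM" patch (placeholder)
gm = np.zeros(img.shape, bool); gm[40:60, 40:60] = True
wm = np.zeros(img.shape, bool); wm[80:100, 80:100] = True
bg = np.zeros(img.shape, bool); bg[:20, :20] = True          # region used to estimate noise
print(f"CNR = {cnr(img, gm, wm, bg):.1f}")
```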
Collapse
Affiliation(s)
- Eduardo Diniz
- Department of Electrical and Computer Engineering, University of Pittsburgh, Pennsylvania, United States
| | - Tales Santini
- Department of Bioengineering, University of Pittsburgh, Pennsylvania, United States
| | - Karim Helmet
- Department of Bioengineering, University of Pittsburgh, Pennsylvania, United States
- Department of Psychiatry, University of Pittsburgh, Pennsylvania, United States
| | - Howard J. Aizenstein
- Department of Bioengineering, University of Pittsburgh, Pennsylvania, United States
- Department of Psychiatry, University of Pittsburgh, Pennsylvania, United States
| | - Tamer S. Ibrahim
- Department of Bioengineering, University of Pittsburgh, Pennsylvania, United States
| |
Collapse
|
18
|
Wang R, Heimann AF, Tannast M, Zheng G. CycleSGAN: A cycle-consistent and semantics-preserving generative adversarial network for unpaired MR-to-CT image synthesis. Comput Med Imaging Graph 2024; 117:102431. [PMID: 39243464 DOI: 10.1016/j.compmedimag.2024.102431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2024] [Revised: 08/09/2024] [Accepted: 08/30/2024] [Indexed: 09/09/2024]
Abstract
CycleGAN has been leveraged to synthesize a CT image from an available MR image after being trained on unpaired data. Due to the lack of direct constraints between the synthetic and the input images, CycleGAN cannot guarantee structural consistency and often generates inaccurate mappings that shift the anatomy, which is highly undesirable for downstream clinical applications such as MRI-guided radiotherapy treatment planning and PET/MRI attenuation correction. In this paper, we propose a cycle-consistent and semantics-preserving generative adversarial network, referred to as CycleSGAN, for unpaired MR-to-CT image synthesis. Our design features a novel and generic way to incorporate semantic information into CycleGAN. This is done by designing a pair of three-player games within the CycleGAN framework, where each three-player game consists of one generator and two discriminators to formulate two distinct types of adversarial learning: appearance adversarial learning and structure adversarial learning. These two types of adversarial learning are alternately trained to ensure both realistic image synthesis and semantic structure preservation. Results on unpaired hip MR-to-CT image synthesis show that our method produces better synthetic CT images in both accuracy and visual quality compared to other state-of-the-art (SOTA) unpaired MR-to-CT image synthesis methods.
Collapse
Affiliation(s)
- Runze Wang
- Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, No. 800, Dongchuan Road, Shanghai, 200240, China
| | - Alexander F Heimann
- Department of Orthopaedic Surgery, HFR Cantonal Hospital, University of Fribourg, Fribourg, Switzerland
| | - Moritz Tannast
- Department of Orthopaedic Surgery, HFR Cantonal Hospital, University of Fribourg, Fribourg, Switzerland
| | - Guoyan Zheng
- Institute of Medical Robotics, School of Biomedical Engineering, Shanghai Jiao Tong University, No. 800, Dongchuan Road, Shanghai, 200240, China.
| |
Collapse
|
19
|
Zheng B, Zhang R, Diao S, Zhu J, Yuan Y, Cai J, Shao L, Li S, Qin W. Dual domain distribution disruption with semantics preservation: Unsupervised domain adaptation for medical image segmentation. Med Image Anal 2024; 97:103275. [PMID: 39032395 DOI: 10.1016/j.media.2024.103275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2024] [Revised: 06/14/2024] [Accepted: 07/10/2024] [Indexed: 07/23/2024]
Abstract
Recent unsupervised domain adaptation (UDA) methods in medical image segmentation commonly utilize Generative Adversarial Networks (GANs) for domain translation. However, the translated images often exhibit a distribution deviation from the ideal due to the inherent instability of GANs, leading to challenges such as visual inconsistency and incorrect style, consequently causing the segmentation model to fall into the fixed wrong pattern. To address this problem, we propose a novel UDA framework known as Dual Domain Distribution Disruption with Semantics Preservation (DDSP). Departing from the idea of generating images conforming to the target domain distribution in GAN-based UDA methods, we make the model domain-agnostic and focus on anatomical structural information by leveraging semantic information as constraints to guide the model to adapt to images with disrupted distributions in both source and target domains. Furthermore, we introduce the inter-channel similarity feature alignment based on the domain-invariant structural prior information, which facilitates the shared pixel-wise classifier to achieve robust performance on target domain features by aligning the source and target domain features across channels. Without any exaggeration, our method significantly outperforms existing state-of-the-art UDA methods on three public datasets (i.e., the heart dataset, the brain dataset, and the prostate dataset). The code is available at https://github.com/MIXAILAB/DDSPSeg.
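The inter-channel similarity alignment mentioned above can be illustrated with a short sketch: compute a channel-by-channel cosine-similarity matrix for source- and target-domain feature maps and penalize their difference. This is my own reading of the idea with toy tensors, not the DDSP code.

```python
import torch
import torch.nn.functional as F

def channel_similarity(feat: torch.Tensor) -> torch.Tensor:
    """Cosine similarity between every pair of channels of a (B, C, H, W) feature map."""
    b, c, h, w = feat.shape
    flat = F.normalize(feat.reshape(b, c, h * w), dim=-1)
    return flat @ flat.transpose(1, 2)          # (B, C, C) similarity matrix

def alignment_loss(src_feat: torch.Tensor, tgt_feat: torch.Tensor) -> torch.Tensor:
    return F.l1_loss(channel_similarity(src_feat), channel_similarity(tgt_feat))

src = torch.randn(2, 32, 16, 16)   # source-domain features (toy)
tgt = torch.randn(2, 32, 16, 16)   # target-domain features (toy)
print(alignment_loss(src, tgt).item())
```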
Collapse
Affiliation(s)
- Boyun Zheng
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, China
| | - Ranran Zhang
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Songhui Diao
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, China
| | - Jingke Zhu
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, China
| | - Yixuan Yuan
- Department of Electronic Engineering, The Chinese University of Hong Kong, 999077, Hong Kong, China
| | - Jing Cai
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, 999077, Hong Kong, China
| | - Liang Shao
- Department of Cardiology, Jiangxi Provincial People's Hospital, The First Affiliated Hospital of Nanchang Medical College, Nanchang 330013, China
| | - Shuo Li
- Department of Biomedical Engineering, Department of Computer and Data Science, Case Western Reserve University, Cleveland, United States.
| | - Wenjian Qin
- Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China.
| |
Collapse
|
20
|
Zhang J, Cui Z, Jiang C, Guo S, Gao F, Shen D. Hierarchical Organ-Aware Total-Body Standard-Dose PET Reconstruction From Low-Dose PET and CT Images. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:13258-13270. [PMID: 37159324 DOI: 10.1109/tnnls.2023.3266551] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/11/2023]
Abstract
Positron emission tomography (PET) is an important functional imaging technology in early disease diagnosis. Generally, the gamma ray emitted by standard-dose tracer inevitably increases the exposure risk to patients. To reduce dosage, a lower dose tracer is often used and injected into patients. However, this often leads to low-quality PET images. In this article, we propose a learning-based method to reconstruct total-body standard-dose PET (SPET) images from low-dose PET (LPET) images and corresponding total-body computed tomography (CT) images. Different from previous works focusing only on a certain part of human body, our framework can hierarchically reconstruct total-body SPET images, considering varying shapes and intensity distributions of different body parts. Specifically, we first use one global total-body network to coarsely reconstruct total-body SPET images. Then, four local networks are designed to finely reconstruct head-neck, thorax, abdomen-pelvic, and leg parts of human body. Moreover, to enhance each local network learning for the respective local body part, we design an organ-aware network with a residual organ-aware dynamic convolution (RO-DC) module by dynamically adapting organ masks as additional inputs. Extensive experiments on 65 samples collected from uEXPLORER PET/CT system demonstrate that our hierarchical framework can consistently improve the performance of all body parts, especially for total-body PET images with PSNR of 30.6 dB, outperforming the state-of-the-art methods in SPET image reconstruction.
Collapse
|
21
|
Rai HM, Dashkevych S, Yoo J. Next-Generation Diagnostics: The Impact of Synthetic Data Generation on the Detection of Breast Cancer from Ultrasound Imaging. MATHEMATICS 2024; 12:2808. [DOI: 10.3390/math12182808] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/11/2025]
Abstract
Breast cancer is one of the most lethal and widespread diseases affecting women worldwide. As a result, it is necessary to diagnose breast cancer accurately and efficiently utilizing the most cost-effective and widely used methods. In this research, we demonstrated that synthetically created high-quality ultrasound data outperformed conventional augmentation strategies for efficiently diagnosing breast cancer using deep learning. We trained a deep-learning model using the EfficientNet-B7 architecture and a large dataset of 3186 ultrasound images acquired from multiple publicly available sources, as well as 10,000 synthetically generated images using generative adversarial networks (StyleGAN3). The model was trained using five-fold cross-validation techniques and validated using four metrics: accuracy, recall, precision, and the F1 score measure. The results showed that integrating synthetically produced data into the training set increased the classification accuracy from 88.72% to 92.01% based on the F1 score, demonstrating the power of generative models to expand and improve the quality of training datasets in medical-imaging applications. This demonstrated that training the model using a larger set of data comprising synthetic images significantly improved its performance by more than 3% over the genuine dataset with common augmentation. Various data augmentation procedures were also investigated to improve the training set’s diversity and representativeness. This research emphasizes the relevance of using modern artificial intelligence and machine-learning technologies in medical imaging by providing an effective strategy for categorizing ultrasound images, which may lead to increased diagnostic accuracy and optimal treatment options. The proposed techniques are highly promising and have strong potential for future clinical application in the diagnosis of breast cancer.
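The evaluation idea, adding synthetic samples only to the training folds of a 5-fold cross-validation and scoring with F1, can be sketched as follows. The data and the logistic-regression classifier are toy stand-ins, not the authors' EfficientNet-B7 pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
X_real, y_real = rng.normal(size=(300, 32)), rng.integers(0, 2, 300)      # real images (stand-in features)
X_synth, y_synth = rng.normal(size=(1000, 32)), rng.integers(0, 2, 1000)  # synthetic images (stand-in features)

scores = []
for train_idx, test_idx in StratifiedKFold(n_splits=5, shuffle=True, random_state=0).split(X_real, y_real):
    # synthetic data only ever joins the training split; test folds stay purely real
    X_tr = np.vstack([X_real[train_idx], X_synth])
    y_tr = np.concatenate([y_real[train_idx], y_synth])
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    scores.append(f1_score(y_real[test_idx], clf.predict(X_real[test_idx])))
print(f"mean F1 across folds: {np.mean(scores):.3f}")
```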
Collapse
Affiliation(s)
- Hari Mohan Rai
- School of Computing, Gachon University, 1342 Seongnam-daero, Sujeong-gu, Seongnam-si 13120, Gyeonggi-do, Republic of Korea
| | - Serhii Dashkevych
- Department of Computer Engineering, Vistula University, Stokłosy 3, 02-787 Warszawa, Poland
| | - Joon Yoo
- School of Computing, Gachon University, 1342 Seongnam-daero, Sujeong-gu, Seongnam-si 13120, Gyeonggi-do, Republic of Korea
| |
Collapse
|
22
|
Dagommer M, Daneshzand M, Nummemnaa A, Guerin B. Robust deep learning estimation of cortical bone porosity from MR T1-weighted images for individualized transcranial focused ultrasound planning. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.07.18.24310644. [PMID: 39072036 PMCID: PMC11275664 DOI: 10.1101/2024.07.18.24310644] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/30/2024]
Abstract
Objective Transcranial focused ultrasound (tFUS) is an emerging neuromodulation approach that has been demonstrated in animals but is difficult to translate to humans because of acoustic attenuation and scattering in the skull. Optimal dose delivery requires subject-specific skull porosity estimates, which have traditionally been obtained using CT. We propose a deep learning (DL) estimation of skull porosity from T1-weighted MRI images which removes the need for radiation-inducing CT scans. Approach We evaluate the impact of different DL approaches, including network architecture, input size and dimensionality, multichannel inputs, data augmentation, and loss functions. We also propose back-propagation in the mask (BIM), a method whereby only voxels inside the skull mask contribute to training. We evaluate the robustness of the best model to input image noise and MRI acquisition parameters and propagate porosity estimation errors in thousands of beam propagation scenarios. Main results Our best-performing model is a cGAN with a ResNet-9 generator with 3D 64×64×64 inputs trained with L1 and L2 losses. The model achieved a mean absolute error of 6.9% in the test set, compared to 9.5% with the pseudo-CT of Izquierdo et al. (38% improvement) and 9.4% with the generic pixel-to-pixel image translation cGAN pix2pix (36% improvement). Acoustic dose distributions in the thalamus were more accurate with our approach than with the pseudo-CT approaches of both Burgos et al. and Izquierdo et al., resulting in near-optimal treatment planning and dose estimation at all frequencies compared to CT (reference). Significance Our DL approach estimates porosity with ~7% error, is robust to input image noise and MRI acquisition parameters (sequence, coils, field strength), and yields near-optimal treatment planning and dose estimates for both central (thalamus) and lateral brain targets (amygdala) in the 200-1000 kHz frequency range.
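A minimal sketch of "back-propagation in the mask" as I read it: only voxels inside the skull mask contribute to the L1 + L2 training loss, so gradients flow only through in-mask predictions. The tensors and weighting are illustrative, not the authors' settings.

```python
import torch

def masked_l1_l2(pred: torch.Tensor, target: torch.Tensor, mask: torch.Tensor,
                 l2_weight: float = 1.0) -> torch.Tensor:
    """L1 + L2 loss averaged over voxels where mask is 1."""
    diff = (pred - target) * mask
    n = mask.sum().clamp(min=1)
    return diff.abs().sum() / n + l2_weight * diff.pow(2).sum() / n

pred = torch.randn(1, 1, 32, 32, 32, requires_grad=True)    # predicted porosity map (toy)
target = torch.rand(1, 1, 32, 32, 32)                       # CT-derived porosity (toy)
skull_mask = (torch.rand(1, 1, 32, 32, 32) > 0.7).float()   # binary skull mask (toy)

loss = masked_l1_l2(pred, target, skull_mask)
loss.backward()        # gradients are nonzero only at in-mask voxels
print(loss.item())
```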
Collapse
Affiliation(s)
- Matthieu Dagommer
- École Supérieure de Physique et de Chimie Industrielles de la Ville de Paris (ESPCI), Paris France
| | - Mohammad Daneshzand
- Harvard Medical School, Boston MA
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown MA
| | - Aapo Nummemnaa
- Harvard Medical School, Boston MA
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown MA
| | - Bastien Guerin
- Harvard Medical School, Boston MA
- Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown MA
| |
Collapse
|
23
|
Chaudhary MFA, Gerard SE, Christensen GE, Cooper CB, Schroeder JD, Hoffman EA, Reinhardt JM. LungViT: Ensembling Cascade of Texture Sensitive Hierarchical Vision Transformers for Cross-Volume Chest CT Image-to-Image Translation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:2448-2465. [PMID: 38373126 PMCID: PMC11227912 DOI: 10.1109/tmi.2024.3367321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/21/2024]
Abstract
Chest computed tomography (CT) at inspiration is often complemented by an expiratory CT to identify peripheral airways disease. Additionally, co-registered inspiratory-expiratory volumes can be used to derive various markers of lung function. Expiratory CT scans, however, may not be acquired due to dose or scan time considerations or may be inadequate due to motion or insufficient exhale, leading to a missed opportunity to evaluate underlying small airways disease. Here, we propose LungViT, a generative adversarial learning approach using hierarchical vision transformers for translating inspiratory CT intensities to corresponding expiratory CT intensities. LungViT addresses several limitations of traditional generative models, including slicewise discontinuities, the limited size of generated volumes, and their inability to model texture transfer at the volumetric level. We propose a shifted-window hierarchical vision transformer architecture with squeeze-and-excitation decoder blocks for modeling dependencies between features. We also propose a multiview texture similarity distance metric for texture and style transfer in 3D. To incorporate global information into the training process and refine the output of our model, we use ensemble cascading. LungViT is able to generate large 3D volumes of size 320×320×320. We train and validate our model using a diverse cohort of 1500 subjects with varying disease severity. To assess model generalizability beyond the development set biases, we evaluate our model on an out-of-distribution external validation set of 200 subjects. Clinical validation on internal and external testing sets shows that synthetic volumes could be reliably adopted for deriving clinical endpoints of chronic obstructive pulmonary disease.
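The squeeze-and-excitation decoder blocks mentioned above follow the standard SE pattern of channel-wise recalibration. Below is a generic SE block sketch, not the LungViT implementation; the channel count and reduction ratio are illustrative.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)              # squeeze: global average pool
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)  # excitation weights per channel
        return x * w                                           # channel-wise rescaling

feat = torch.randn(2, 64, 32, 32)
print(SEBlock(64)(feat).shape)       # torch.Size([2, 64, 32, 32])
```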
Collapse
|
24
|
He J, Ma H, Guo M, Wang J, Wang Z, Fan G. Research into super-resolution in medical imaging from 2000 to 2023: bibliometric analysis and visualization. Quant Imaging Med Surg 2024; 14:5109-5130. [PMID: 39022237 PMCID: PMC11250356 DOI: 10.21037/qims-24-67] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/12/2024] [Accepted: 05/29/2024] [Indexed: 07/20/2024]
Abstract
Background Super-resolution (SR) refers to the use of hardware or software methods to enhance the resolution of low-resolution (LR) images and produce high-resolution (HR) images. SR is applied frequently across a variety of medical imaging contexts, particularly in the enhancement of neuroimaging, with specific techniques including SR microscopy-used for diagnostic biomarkers-and functional magnetic resonance imaging (fMRI)-a neuroimaging method for the measurement and mapping of brain activity. This bibliometric analysis of the literature related to SR in medical imaging was conducted to identify the global trends in this field, and visualization via graphs was completed to offer insights into future research prospects. Methods In order to perform a bibliometric analysis of the SR literature, this study sourced all publications from the Web of Science Core Collection (WoSCC) database published from January 1, 2000, to October 11, 2023. A total of 3,262 articles on SR in medical imaging were evaluated. VOSviewer was used to perform co-occurrence and co-authorship analysis, and network visualization of the literature data, including author, journal, publication year, institution, and keywords, was completed. Results From 2000 to 2023, the annual publication volume surged from 13 to 366. The top three journals in this field in terms of publication volume were as follows: (I) Scientific Reports (86 publications), (II) IEEE Transactions on Medical Imaging (74 publications), and (III) IEEE Transactions on Ultrasonics Ferroelectrics and Frequency Control (56 publications). The most prolific country, institution, and author were the United States (1,017 publications; 31,301 citations), the Chinese Academy of Sciences (124 publications; 2,758 citations), and Dinggang Shen (20 publications; 671 citations), respectively. A cluster analysis of the top 100 keywords was conducted, which revealed the presence of five co-occurrence clusters: (I) SR and artificial intelligence (AI) for medical image enhancement, (II) SR and inverse problem processing concepts for positron emission tomography (PET) image processing, (III) SR ultrasound through microbubbles, (IV) SR microscopy for Alzheimer and Parkinson diseases, and (V) SR in brain fMRI: rapid acquisition and precise imaging. The most recent high-frequency keywords were deep learning (DL), magnetic resonance imaging (MRI), and convolutional neural networks (CNNs). Conclusions Over the past two decades, the output of publications by countries, institutions, and authors in the field of SR in medical imaging has steadily increased. Based on bibliometric analysis of international trends, the resurgence of SR in medical imaging has been facilitated by advancements in AI. The increasing need for multi-center and multi-modal medical images has further incentivized global collaboration, leading to the diverse research paths in SR medical imaging among prominent scientists.
Collapse
Affiliation(s)
- Jiachuan He
- Department of Radiology, the First Hospital of China Medical University, Shenyang, China
| | - He Ma
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
| | - Miaoran Guo
- Department of Radiology, the First Hospital of China Medical University, Shenyang, China
| | - Jiaqi Wang
- Department of Radiology, the First Hospital of China Medical University, Shenyang, China
| | - Zhongqing Wang
- Department of Information Center, the First Hospital of China Medical University, Shenyang, China
| | - Guoguang Fan
- Department of Radiology, the First Hospital of China Medical University, Shenyang, China
| |
Collapse
|
25
|
Luo Y, Yang Q, Liu Z, Shi Z, Huang W, Zheng G, Cheng J. Target-Guided Diffusion Models for Unpaired Cross-Modality Medical Image Translation. IEEE J Biomed Health Inform 2024; 28:4062-4071. [PMID: 38662561 DOI: 10.1109/jbhi.2024.3393870] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 07/03/2024]
Abstract
In a clinical setting, certain medical imaging modalities are often unavailable due to various considerations such as cost, radiation, etc. Therefore, unpaired cross-modality translation techniques, which involve training on unpaired data and synthesizing the target modality with the guidance of the acquired source modality, are of great interest. Previous methods for synthesizing target medical images establish a one-shot mapping through generative adversarial networks (GANs). As promising alternatives to GANs, diffusion models have recently received wide interest in generative tasks. In this paper, we propose a target-guided diffusion model (TGDM) for unpaired cross-modality medical image translation. For training, to encourage our diffusion model to learn more visual concepts, we applied a perception prioritized weighting scheme (P2W) to the training objectives. For sampling, a pre-trained classifier is adopted in the reverse process to relieve modality-specific remnants from source data. Experiments on both brain MRI-CT and prostate MRI-US datasets demonstrate that the proposed method achieves a visually realistic result that mimics a vivid anatomical section of the target organ. In addition, we have also conducted a subjective assessment based on the synthesized samples to further validate the clinical value of TGDM.
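The perception-prioritized weighting idea can be sketched as down-weighting the diffusion training loss at high-SNR (late, nearly clean) timesteps. The schedule and the k and gamma values below are illustrative choices based on the published P2 formulation, not the TGDM settings.

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)                 # linear beta schedule (toy)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)
snr = alphas_bar / (1.0 - alphas_bar)                 # signal-to-noise ratio per timestep

k, gamma = 1.0, 1.0
p2_weight = 1.0 / (k + snr) ** gamma                  # P2-style weight on the noise-prediction loss

def weighted_diffusion_loss(eps_pred: torch.Tensor, eps_true: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    per_sample = ((eps_pred - eps_true) ** 2).flatten(1).mean(dim=1)
    return (p2_weight[t] * per_sample).mean()

eps_pred, eps_true = torch.randn(4, 1, 32, 32), torch.randn(4, 1, 32, 32)
t = torch.randint(0, T, (4,))
print(weighted_diffusion_loss(eps_pred, eps_true, t).item())
```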
Collapse
|
26
|
Meng X, Sun K, Xu J, He X, Shen D. Multi-Modal Modality-Masked Diffusion Network for Brain MRI Synthesis With Random Modality Missing. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:2587-2598. [PMID: 38393846 DOI: 10.1109/tmi.2024.3368664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/25/2024]
Abstract
Synthesis of unavailable imaging modalities from available ones can generate modality-specific complementary information and enable multi-modality based medical images diagnosis or treatment. Existing generative methods for medical image synthesis are usually based on cross-modal translation between acquired and missing modalities. These methods are usually dedicated to specific missing modality and perform synthesis in one shot, which cannot deal with varying number of missing modalities flexibly and construct the mapping across modalities effectively. To address the above issues, in this paper, we propose a unified Multi-modal Modality-masked Diffusion Network (M2DN), tackling multi-modal synthesis from the perspective of "progressive whole-modality inpainting", instead of "cross-modal translation". Specifically, our M2DN considers the missing modalities as random noise and takes all the modalities as a unity in each reverse diffusion step. The proposed joint synthesis scheme performs synthesis for the missing modalities and self-reconstruction for the available ones, which not only enables synthesis for arbitrary missing scenarios, but also facilitates the construction of common latent space and enhances the model representation ability. Besides, we introduce a modality-mask scheme to encode availability status of each incoming modality explicitly in a binary mask, which is adopted as condition for the diffusion model to further enhance the synthesis performance of our M2DN for arbitrary missing scenarios. We carry out experiments on two public brain MRI datasets for synthesis and downstream segmentation tasks. Experimental results demonstrate that our M2DN outperforms the state-of-the-art models significantly and shows great generalizability for arbitrary missing modalities.
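A rough sketch of the modality-mask idea described above: missing modalities are replaced by random noise, and a binary availability vector is passed as an extra condition. The modality names and tensor shapes are placeholders, not the M2DN code.

```python
import torch

modalities = ["T1", "T2", "FLAIR", "T1ce"]
volumes = {m: torch.randn(1, 1, 64, 64) for m in modalities}   # toy 2D slices per modality
available = {"T1": True, "T2": False, "FLAIR": True, "T1ce": False}

channels, mask = [], []
for m in modalities:
    if available[m]:
        channels.append(volumes[m])
    else:
        channels.append(torch.randn_like(volumes[m]))   # missing modality -> pure noise
    mask.append(float(available[m]))

x = torch.cat(channels, dim=1)                  # (1, 4, 64, 64) joint multi-modality input
availability = torch.tensor(mask).view(1, -1)   # (1, 4) binary mask used as condition
print(x.shape, availability)
```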
Collapse
|
27
|
Guan Y, Li Y, Ke Z, Peng X, Liu R, Li Y, Du YP, Liang ZP. Learning-Assisted Fast Determination of Regularization Parameter in Constrained Image Reconstruction. IEEE Trans Biomed Eng 2024; 71:2253-2264. [PMID: 38376982 DOI: 10.1109/tbme.2024.3367762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/22/2024]
Abstract
OBJECTIVE To leverage machine learning (ML) for fast selection of optimal regularization parameter in constrained image reconstruction. METHODS Constrained image reconstruction is often formulated as a regularization problem and selecting a good regularization parameter value is an essential step. We solved this problem using an ML-based approach by leveraging the finding that for a specific constrained reconstruction problem defined for a fixed class of image functions, the optimal regularization parameter value is weakly subject-dependent and the dependence can be captured using few experimental data. The proposed method has four key steps: a) solution of a given constrained reconstruction problem for a few (say, 3) pre-selected regularization parameter values, b) extraction of multiple approximated quality metrics from the initial reconstructions, c) predicting the true quality metrics values from the approximated values using pre-trained neural networks, and d) determination of the optimal regularization parameter by fusing the predicted quality metrics. RESULTS The effectiveness of the proposed method was demonstrated in two constrained reconstruction problems. Compared with L-curve-based method, the proposed method determined the regularization parameters much faster and produced substantially improved reconstructions. Our method also outperformed state-of-the-art learning-based methods when trained with limited experimental data. CONCLUSION This paper demonstrates the feasibility and improved reconstruction quality by using machine learning to determine the regularization parameter in constrained reconstruction. SIGNIFICANCE The proposed method substantially reduces the computational burden of the traditional methods (e.g., L-curve) or relaxes the requirement of large training data by modern learning-based methods, thus enhancing the practical utility of constrained reconstruction.
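The overall selection loop (steps a-d above) can be illustrated with a toy example: reconstruct with a few pre-selected regularization values, score each candidate, and keep the best. Here the "reconstruction" is plain Tikhonov-regularized least squares and the score is MSE against a known phantom; in the paper the true metrics are instead predicted by pre-trained networks, so this is only a schematic stand-in.

```python
import numpy as np

rng = np.random.default_rng(0)
x_true = rng.normal(size=64)                       # toy "image"
A = rng.normal(size=(48, 64))                      # under-determined forward model
y = A @ x_true + 0.05 * rng.normal(size=48)        # noisy measurements

def reconstruct(lam: float) -> np.ndarray:
    """Solve min ||Ax - y||^2 + lam * ||x||^2 in closed form."""
    return np.linalg.solve(A.T @ A + lam * np.eye(64), A.T @ y)

candidates = [1e-3, 1e-1, 1e1]                     # a few pre-selected regularization values
scores = {lam: float(np.mean((reconstruct(lam) - x_true) ** 2)) for lam in candidates}
best = min(scores, key=scores.get)                 # fuse/compare metrics, keep the best value
print(f"selected lambda = {best}, per-lambda MSE = {scores}")
```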
Collapse
|
28
|
Zhang D, Duan C, Anazodo U, Wang ZJ, Lou X. Self-supervised anatomical continuity enhancement network for 7T SWI synthesis from 3T SWI. Med Image Anal 2024; 95:103184. [PMID: 38723320 DOI: 10.1016/j.media.2024.103184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/11/2023] [Revised: 03/13/2024] [Accepted: 04/18/2024] [Indexed: 06/01/2024]
Abstract
Synthesizing 7T Susceptibility Weighted Imaging (SWI) from 3T SWI could offer significant clinical benefits by combining the high sensitivity of 7T SWI for neurological disorders with the widespread availability of 3T SWI in diagnostic routines. Although methods exist for synthesizing 7T Magnetic Resonance Imaging (MRI), they primarily focus on traditional MRI modalities like T1-weighted imaging, rather than SWI. SWI poses unique challenges, including limited data availability and the invisibility of certain tissues in individual 3T SWI slices. To address these challenges, we propose a Self-supervised Anatomical Continuity Enhancement (SACE) network to synthesize 7T SWI from 3T SWI using plentiful 3T SWI data and limited 3T-7T paired data. The SACE employs two specifically designed pretext tasks to utilize low-level representations from abundant 3T SWI data for assisting 7T SWI synthesis in a downstream task with limited paired data. One pretext task emphasizes input-specific morphology by balancing the elimination of redundant patterns with the preservation of essential morphology, preventing the blurring of synthetic 7T SWI images. The other task improves the synthesis of tissues that are invisible in a single 3T SWI slice by aligning adjacent slices with the current slice and predicting their difference fields. The downstream task innovatively combines clinical knowledge with brain substructure diagrams to selectively enhance clinically relevant features. When evaluated on a dataset comprising 97 cases (5495 slices), the proposed method achieved a Peak Signal-to-Noise Ratio (PSNR) of 23.05 dB and a Structural Similarity Index (SSIM) of 0.688. Due to the absence of specific methods for 7T SWI, our method was compared with existing enhancement techniques for general 7T MRI synthesis, outperforming these techniques in the context of 7T SWI synthesis. Clinical evaluations have shown that our synthetic 7T SWI is clinically effective, demonstrating its potential as a clinical tool.
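The reported image-quality metrics (PSNR in dB and SSIM) can be computed as in the short sketch below using scikit-image; the arrays are random stand-ins for synthetic and acquired 7T SWI slices.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
real_7t = rng.random((256, 256)).astype(np.float32)
synth_7t = np.clip(real_7t + 0.05 * rng.normal(size=(256, 256)), 0, 1).astype(np.float32)

psnr = peak_signal_noise_ratio(real_7t, synth_7t, data_range=1.0)
ssim = structural_similarity(real_7t, synth_7t, data_range=1.0)
print(f"PSNR: {psnr:.2f} dB, SSIM: {ssim:.3f}")
```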
Collapse
Affiliation(s)
- Dong Zhang
- Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC, Canada
| | - Caohui Duan
- Department of Radiology, Chinese PLA General Hospital, Beijing, China
| | - Udunna Anazodo
- Department of Neurology and Neurosurgery, McGill University, Montreal, QC, Canada
| | - Z Jane Wang
- Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC, Canada
| | - Xin Lou
- Department of Radiology, Chinese PLA General Hospital, Beijing, China.
| |
Collapse
|
29
|
Pan Q, Shen H, Li P, Lai B, Jiang A, Huang W, Lu F, Peng H, Fang L, Kuebler WM, Pries AR, Ning G. In Silico Design of Heterogeneous Microvascular Trees Using Generative Adversarial Networks and Constrained Constructive Optimization. Microcirculation 2024; 31:e12854. [PMID: 38690631 DOI: 10.1111/micc.12854] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2023] [Revised: 04/01/2024] [Accepted: 04/10/2024] [Indexed: 05/02/2024]
Abstract
OBJECTIVE Designing physiologically adequate microvascular trees is of crucial relevance for bioengineering functional tissues and organs. Yet, currently available methods are poorly suited to replicate the morphological and topological heterogeneity of real microvascular trees because the parameters used to control tree generation are too simplistic to mimic results of the complex angiogenetic and structural adaptation processes in vivo. METHODS We propose a method to overcome this limitation by integrating a conditional deep convolutional generative adversarial network (cDCGAN) with a local fractal dimension-oriented constrained constructive optimization (LFDO-CCO) strategy. The cDCGAN learns the patterns of real microvascular bifurcations allowing for their artificial replication. The LFDO-CCO strategy connects the generated bifurcations hierarchically to form microvascular trees with a vessel density corresponding to that observed in healthy tissues. RESULTS The generated artificial microvascular trees are consistent with real microvascular trees regarding characteristics such as fractal dimension, vascular density, and coefficient of variation of diameter, length, and tortuosity. CONCLUSIONS These results support the adoption of the proposed strategy for the generation of artificial microvascular trees in tissue engineering as well as for computational modeling and simulations of microcirculatory physiology.
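One of the characteristics used above to compare artificial and real microvascular trees is the fractal dimension. A hedged box-counting sketch follows; the binary image and box sizes are toy choices, not the LFDO-CCO procedure.

```python
import numpy as np

def box_count(binary: np.ndarray, box: int) -> int:
    """Number of box-by-box cells that contain at least one foreground pixel."""
    h, w = binary.shape
    trimmed = binary[:h - h % box, :w - w % box]
    blocks = trimmed.reshape((h - h % box) // box, box, (w - w % box) // box, box)
    return int(blocks.any(axis=(1, 3)).sum())

rng = np.random.default_rng(0)
vessels = rng.random((256, 256)) > 0.95        # stand-in for a binarized vessel map

sizes = np.array([2, 4, 8, 16, 32])
counts = np.array([box_count(vessels, s) for s in sizes])
slope, _ = np.polyfit(np.log(1.0 / sizes), np.log(counts), 1)   # N(s) ~ s^-D
print(f"estimated fractal dimension: {slope:.2f}")
```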
Collapse
Affiliation(s)
- Qing Pan
- College of Information Engineering, Zhejiang University of Technology, Hangzhou, China
| | - Huanghui Shen
- College of Information Engineering, Zhejiang University of Technology, Hangzhou, China
| | - Peilun Li
- Department of Biomedical Engineering, Key Laboratory of Biomedical Engineering of MOE, Zhejiang University, Hangzhou, China
| | - Biyun Lai
- College of Information Engineering, Zhejiang University of Technology, Hangzhou, China
| | - Akang Jiang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou, China
| | - Wenjie Huang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou, China
| | - Fei Lu
- College of Information Engineering, Zhejiang University of Technology, Hangzhou, China
| | - Hong Peng
- College of Information Engineering, Zhejiang University of Technology, Hangzhou, China
| | - Luping Fang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou, China
| | - Wolfgang M Kuebler
- Institute of Physiology, Charité Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany
| | - Axel R Pries
- Institute of Physiology, Charité Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany
- Department of Medicine, Faculty of Medicine and Dentistry, Danube Private University, Krems, Austria
| | - Gangmin Ning
- Department of Biomedical Engineering, Key Laboratory of Biomedical Engineering of MOE, Zhejiang University, Hangzhou, China
| |
Collapse
|
30
|
Fard AS, Reutens DC, Ramsay SC, Goodman SJ, Ghosh S, Vegh V. Image synthesis of interictal SPECT from MRI and PET using machine learning. Front Neurol 2024; 15:1383773. [PMID: 38988603 PMCID: PMC11234346 DOI: 10.3389/fneur.2024.1383773] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/08/2024] [Accepted: 06/12/2024] [Indexed: 07/12/2024] Open
Abstract
Background Cross-modality image estimation can be performed using generative adversarial networks (GANs). To date, SPECT image estimation from another medical imaging modality using this technique has not been considered. We evaluate the estimation of SPECT from MRI and PET, and additionally assess the necessity for cross-modality image registration for GAN training. Methods We estimated interictal SPECT from PET and MRI as a single-channel input, and as a multi-channel input to the GAN. We collected data from 48 individuals with epilepsy and converted them to 3D isotropic images for consistency across the modalities. Training and testing data were prepared in native and template spaces. The Pix2pix framework within the GAN network was adopted. We evaluated the addition of the structural similarity index metric to the loss function in the GAN implementation. Root-mean-square error, structural similarity index, and peak signal-to-noise ratio were used to assess how well SPECT images were able to be synthesised. Results High-quality SPECT images could be synthesised in each case. On average, the use of native space images resulted in a 5.4% improvement in SSIM compared with the use of images registered to template space. The addition of the structural similarity index metric to the GAN loss function did not result in improved synthetic SPECT images. Using PET in either the single channel or dual channel implementation led to the best results; however, MRI could produce SPECT images close in quality. Conclusion Synthesis of SPECT from MRI or PET can potentially reduce the number of scans needed for epilepsy patient evaluation and reduce patient exposure to radiation.
Collapse
Affiliation(s)
- Azin Shokraei Fard
- Centre for Advanced Imaging, University of Queensland, Brisbane, QLD, Australia
| | - David C. Reutens
- Centre for Advanced Imaging, University of Queensland, Brisbane, QLD, Australia
- Royal Brisbane and Women’s Hospital, Brisbane, QLD, Australia
- ARC Training Centre for Innovation in Biomedical Imaging Technology, Brisbane, QLD, Australia
| | | | | | - Soumen Ghosh
- Centre for Advanced Imaging, University of Queensland, Brisbane, QLD, Australia
- ARC Training Centre for Innovation in Biomedical Imaging Technology, Brisbane, QLD, Australia
| | - Viktor Vegh
- Centre for Advanced Imaging, University of Queensland, Brisbane, QLD, Australia
- ARC Training Centre for Innovation in Biomedical Imaging Technology, Brisbane, QLD, Australia
| |
Collapse
|
31
|
Park Y, Lee MJ, Yoo S, Kim CY, Namgung JY, Park Y, Park H, Lee EC, Yoon YD, Paquola C, Bernhardt BC, Park BY. GAN-MAT: Generative adversarial network-based microstructural profile covariance analysis toolbox. Neuroimage 2024; 291:120595. [PMID: 38554782 DOI: 10.1016/j.neuroimage.2024.120595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/23/2023] [Revised: 03/25/2024] [Accepted: 03/28/2024] [Indexed: 04/02/2024] Open
Abstract
Multimodal magnetic resonance imaging (MRI) provides complementary information for investigating brain structure and function; for example, an in vivo microstructure-sensitive proxy can be estimated using the ratio between T1- and T2-weighted structural MRI. However, acquiring multiple imaging modalities is challenging in patients with inattentive disorders. In this study, we proposed a comprehensive framework to provide multiple imaging features related to the brain microstructure using only T1-weighted MRI. Our toolbox consists of (i) synthesizing T2-weighted MRI from T1-weighted MRI using a conditional generative adversarial network; (ii) estimating microstructural features, including intracortical covariance and moment features of cortical layer-wise microstructural profiles; and (iii) generating a microstructural gradient, which is a low-dimensional representation of the intracortical microstructure profile. We trained and tested our toolbox using T1- and T2-weighted MRI scans of 1,104 healthy young adults obtained from the Human Connectome Project database. We found that the synthesized T2-weighted MRI was very similar to the actual image and that the synthesized data successfully reproduced the microstructural features. The toolbox was validated using an independent dataset containing healthy controls and patients with episodic migraine as well as the atypical developmental condition of autism spectrum disorder. Our toolbox may provide a new paradigm for analyzing multimodal structural MRI in the neuroscience community and is openly accessible at https://github.com/CAMIN-neuro/GAN-MAT.
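The microstructure-sensitive proxy mentioned above is the voxel-wise T1w/T2w ratio. A minimal sketch follows; the arrays stand in for co-registered T1- and T2-weighted (or synthesized T2-weighted) volumes, and the masking and epsilon choices are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
t1w = rng.uniform(100, 1000, size=(64, 64, 64))       # toy T1-weighted intensities
t2w = rng.uniform(100, 1000, size=(64, 64, 64))       # toy (or synthesized) T2-weighted intensities
brain_mask = rng.random((64, 64, 64)) > 0.3           # toy brain mask

ratio = np.zeros_like(t1w)
ratio[brain_mask] = t1w[brain_mask] / np.maximum(t2w[brain_mask], 1e-6)
print(f"mean in-mask T1w/T2w ratio: {ratio[brain_mask].mean():.3f}")
```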
Collapse
Affiliation(s)
- Yeongjun Park
- Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon, South Korea
| | - Mi Ji Lee
- Department of Neurology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, South Korea
| | | | - Chae Yeon Kim
- Department of Data Science, Inha University, Incheon, South Korea
| | | | - Yunseo Park
- Department of Data Science, Inha University, Incheon, South Korea
| | - Hyunjin Park
- School of Electronic and Electrical Engineering, Sungkyunkwan University, Suwon, South Korea; Center for Neuroscience Imaging Research, Institute for Basic Science, Suwon, South Korea
| | | | | | - Casey Paquola
- Institute of Neuroscience and Medicine (INM-1), Forschungszentrum Jülich, Jülich, Germany
| | - Boris C Bernhardt
- McConnell Brain Imaging Centre, Montreal Neurological Institute and Hospital, McGill University, Montreal, Quebec, Canada
| | - Bo-Yong Park
- Department of Data Science, Inha University, Incheon, South Korea; Center for Neuroscience Imaging Research, Institute for Basic Science, Suwon, South Korea; Department of Statistics and Data Science, Inha University, Incheon, South Korea.
| |
Collapse
|
32
|
Kim J, Li Y, Shin BS. 3D-DGGAN: A Data-Guided Generative Adversarial Network for High Fidelity in Medical Image Generation. IEEE J Biomed Health Inform 2024; 28:2904-2915. [PMID: 38416610 DOI: 10.1109/jbhi.2024.3367375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/01/2024]
Abstract
Three-dimensional images are frequently used in medical imaging research for classification, segmentation, and detection. However, the limited availability of 3D images hinders research progress due to network training difficulties. Generative methods have been proposed to create medical images using AI techniques. Nevertheless, 2D approaches have difficulty dealing with 3D anatomical structures, which can result in discontinuities between slices. To mitigate these discontinuities, several 3D generative networks have been proposed. However, the scarcity of available 3D images makes training these networks with limited samples inadequate for producing high-fidelity 3D images. We propose a data-guided generative adversarial network to provide high fidelity in 3D image generation. The generator creates fake images with noise using reference code obtained by extracting features from real images. The generator also creates decoded images using reference code without noise. These decoded images are compared to the real images to evaluate fidelity in the reference code. This generation process can create high-fidelity 3D images from only a small amount of real training data. Additionally, our method employs three types of discriminator: volume (evaluates all the slices), slab (evaluates a set of consecutive slices), and slice (evaluates randomly selected slices). The proposed discriminator enhances fidelity by differentiating between real and fake images based on detailed characteristics. Results from our method are compared with existing methods by using quantitative analysis such as Fréchet inception distance and maximum mean discrepancy. The results demonstrate that our method produces more realistic 3D images than existing methods.
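The three discriminator views described above (full volume, a slab of consecutive slices, and randomly selected single slices) amount to different ways of slicing the generated tensor. A hedged sketch follows; shapes and slab width are illustrative, not the 3D-DGGAN settings.

```python
import torch

volume = torch.randn(1, 1, 64, 128, 128)       # (B, C, D, H, W) generated volume (toy)

# volume view: the full 3D tensor
volume_view = volume

# slab view: a block of consecutive slices along the depth axis
slab_width = 8
start = torch.randint(0, volume.shape[2] - slab_width + 1, (1,)).item()
slab_view = volume[:, :, start:start + slab_width]

# slice view: a few randomly chosen 2D slices
idx = torch.randperm(volume.shape[2])[:4]
slice_view = volume[:, :, idx]

print(volume_view.shape, slab_view.shape, slice_view.shape)
```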
Collapse
|
33
|
Duan C, Bian X, Cheng K, Lyu J, Xiong Y, Xiao S, Wang X, Duan Q, Li C, Huang J, Hu J, Wang ZJ, Zhou X, Lou X. Synthesized 7T MPRAGE From 3T MPRAGE Using Generative Adversarial Network and Validation in Clinical Brain Imaging: A Feasibility Study. J Magn Reson Imaging 2024; 59:1620-1629. [PMID: 37559435 DOI: 10.1002/jmri.28944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/27/2022] [Revised: 07/24/2023] [Accepted: 07/26/2023] [Indexed: 08/11/2023] Open
Abstract
BACKGROUND Ultra-high field 7T MRI can provide excellent tissue contrast and anatomical details, but is often cost prohibitive, and is not widely accessible in clinical practice. PURPOSE To generate synthetic 7T images from widely acquired 3T images with deep learning and to evaluate the feasibility of this approach for brain imaging. STUDY TYPE Prospective. POPULATION 33 healthy volunteers and 89 patients with brain diseases, divided into training, and evaluation datasets in the ratio 4:1. SEQUENCE AND FIELD STRENGTH T1-weighted nonenhanced or contrast-enhanced magnetization-prepared rapid acquisition gradient-echo sequence at both 3T and 7T. ASSESSMENT A generative adversarial network (SynGAN) was developed to produce synthetic 7T images from 3T images as input. SynGAN training and evaluation were performed separately for nonenhanced and contrast-enhanced paired acquisitions. Qualitative image quality of acquired 3T and 7T images and of synthesized 7T images was evaluated by three radiologists in terms of overall image quality, artifacts, sharpness, contrast, and visualization of vessel using 5-point Likert scales. STATISTICAL TESTS Wilcoxon signed rank tests to compare synthetic 7T images with acquired 7T and 3T images and intraclass correlation coefficients to evaluate interobserver variability. P < 0.05 was considered significant. RESULTS Of the 122 paired 3T and 7T MRI scans, 66 were acquired without contrast agent and 56 with contrast agent. The average time to generate synthetic images was ~11.4 msec per slice (2.95 sec per participant). The synthetic 7T images achieved significantly improved tissue contrast and sharpness in comparison to 3T images in both nonenhanced and contrast-enhanced subgroups. Meanwhile, there was no significant difference between acquired 7T and synthetic 7T images in terms of all the evaluation criteria for both nonenhanced and contrast-enhanced subgroups (P ≥ 0.180). DATA CONCLUSION The deep learning model has potential to generate synthetic 7T images with similar image quality to acquired 7T images. LEVEL OF EVIDENCE 2 TECHNICAL EFFICACY: Stage 1.
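The reader-score comparison above uses Wilcoxon signed-rank tests on paired 5-point Likert ratings. A short sketch follows; the scores are fabricated toy values for illustration only, and the zero-handling option is an assumption.

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)
scores_acquired_7t = rng.integers(3, 6, size=30)                                    # Likert 1-5, toy
scores_synthetic_7t = np.clip(scores_acquired_7t + rng.integers(-1, 2, size=30), 1, 5)

stat, p = wilcoxon(scores_acquired_7t, scores_synthetic_7t, zero_method="zsplit")
print(f"Wilcoxon statistic = {stat:.1f}, p = {p:.3f}")
```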
Collapse
Affiliation(s)
- Caohui Duan
- Department of Radiology, Chinese PLA General Hospital, Beijing, China
| | - Xiangbing Bian
- Department of Radiology, Chinese PLA General Hospital, Beijing, China
| | - Kun Cheng
- Department of Radiology, Chinese PLA General Hospital, Beijing, China
| | - Jinhao Lyu
- Department of Radiology, Chinese PLA General Hospital, Beijing, China
| | - Yongqin Xiong
- Department of Radiology, Chinese PLA General Hospital, Beijing, China
| | - Sa Xiao
- Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences-Wuhan National Laboratory for Optoelectronics, Wuhan, China
| | - Xueyang Wang
- Department of Radiology, Chinese PLA General Hospital, Beijing, China
| | - Qi Duan
- Department of Radiology, Chinese PLA General Hospital, Beijing, China
| | - Chenxi Li
- Department of Radiology, Chinese PLA General Hospital, Beijing, China
| | - Jiayu Huang
- Department of Radiology, Chinese PLA General Hospital, Beijing, China
| | - Jianxing Hu
- Department of Radiology, Chinese PLA General Hospital, Beijing, China
| | - Z Jane Wang
- Department of Electrical and Computer Engineering, The University of British Columbia, Vancouver, British Columbia, Canada
| | - Xin Zhou
- Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences-Wuhan National Laboratory for Optoelectronics, Wuhan, China
| | - Xin Lou
- Department of Radiology, Chinese PLA General Hospital, Beijing, China
| |
Collapse
|
34
|
Ma Y, Zhou W, Ma R, Wang E, Yang S, Tang Y, Zhang XP, Guan X. DOVE: Doodled vessel enhancement for photoacoustic angiography super resolution. Med Image Anal 2024; 94:103106. [PMID: 38387244 DOI: 10.1016/j.media.2024.103106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2023] [Revised: 12/12/2023] [Accepted: 02/08/2024] [Indexed: 02/24/2024]
Abstract
Deep-learning-based super-resolution photoacoustic angiography (PAA) has emerged as a valuable tool for enhancing the resolution of blood vessel images and aiding in disease diagnosis. However, due to the scarcity of training samples, PAA super-resolution models do not generalize well, especially in the challenging in-vivo imaging of organs with deep tissue penetration. Furthermore, prolonged exposure to high laser intensity during the image acquisition process can lead to tissue damage and secondary infections. To address these challenges, we propose an approach, doodled vessel enhancement (DOVE), that utilizes hand-drawn doodles to train a PAA super-resolution model. With a training dataset consisting of only 32 real PAA images, we construct a diffusion model that interprets hand-drawn doodles as low-resolution images. DOVE enables us to generate a large number of realistic PAA images, achieving a 49.375% fool rate, even among experts in photoacoustic imaging. Subsequently, we employ these generated images to train a self-similarity-based model for super-resolution. During cross-domain tests, our method, trained solely on generated images, achieves a structural similarity value of 0.8591, surpassing the scores of all other models trained with real high-resolution images. DOVE successfully overcomes the limitation of insufficient training samples and unlocks the clinical application potential of super-resolution-based biomedical imaging.
Collapse
Affiliation(s)
- Yuanzheng Ma
- Tsinghua-Berkeley Shenzhen Institute, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China; Institute of Data and Information, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
| | - Wangting Zhou
- Engineering Research Center of Molecular & Neuro Imaging of the Ministry of Education, Xidian University, Xi'an, Shaanxi 710126, China
| | - Rui Ma
- MOE Key Laboratory of Laser Life Science & Institute of Laser Life Science, College of Biophotonics, South China Normal University, Guangzhou, 510631, China
| | - Erqi Wang
- MOE Key Laboratory of Laser Life Science & Institute of Laser Life Science, College of Biophotonics, South China Normal University, Guangzhou, 510631, China
| | - Sihua Yang
- MOE Key Laboratory of Laser Life Science & Institute of Laser Life Science, College of Biophotonics, South China Normal University, Guangzhou, 510631, China.
| | - Yansong Tang
- Tsinghua-Berkeley Shenzhen Institute, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China; Institute of Data and Information, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
| | - Xiao-Ping Zhang
- Tsinghua-Berkeley Shenzhen Institute, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China; Institute of Data and Information, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China
| | - Xun Guan
- Tsinghua-Berkeley Shenzhen Institute, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China; Institute of Data and Information, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, China.
| |
Collapse
|
35
|
Dalmaz O, Mirza MU, Elmas G, Ozbey M, Dar SUH, Ceyani E, Oguz KK, Avestimehr S, Çukur T. One model to unite them all: Personalized federated learning of multi-contrast MRI synthesis. Med Image Anal 2024; 94:103121. [PMID: 38402791 DOI: 10.1016/j.media.2024.103121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2023] [Revised: 02/20/2024] [Accepted: 02/21/2024] [Indexed: 02/27/2024]
Abstract
Curation of large, diverse MRI datasets via multi-institutional collaborations can help improve learning of generalizable synthesis models that reliably translate source- onto target-contrast images. To facilitate collaborations, federated learning (FL) adopts decentralized model training while mitigating privacy concerns by avoiding sharing of imaging data. However, conventional FL methods can be impaired by the inherent heterogeneity in the data distribution, with domain shifts evident within and across imaging sites. Here we introduce the first personalized FL method for MRI Synthesis (pFLSynth) that improves reliability against data heterogeneity via model specialization to individual sites and synthesis tasks (i.e., source-target contrasts). To do this, pFLSynth leverages an adversarial model equipped with novel personalization blocks that control the statistics of generated feature maps across the spatial/channel dimensions, given latent variables specific to sites and tasks. To further promote communication efficiency and site specialization, partial network aggregation is employed over later generator stages while earlier generator stages and the discriminator are trained locally. As such, pFLSynth enables multi-task training of multi-site synthesis models with high generalization performance across sites and tasks. Comprehensive experiments demonstrate the superior performance and reliability of pFLSynth in MRI synthesis against prior federated methods.
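The partial network aggregation described above can be sketched as averaging only the parameters of the later generator stages across sites while earlier stages (and the discriminator) stay local. This is a schematic FedAvg-style sketch, not the pFLSynth code; the "late." name prefix is a placeholder convention.

```python
import copy
import torch

def partial_fedavg(site_states, shared_prefix="late."):
    """Average only the parameters whose names start with `shared_prefix`."""
    new_states = copy.deepcopy(site_states)
    shared_keys = [k for k in site_states[0] if k.startswith(shared_prefix)]
    for k in shared_keys:
        avg = torch.stack([s[k].float() for s in site_states]).mean(dim=0)
        for s in new_states:
            s[k] = avg.clone()          # broadcast the aggregated weights back to every site
    return new_states

# toy state dicts for two sites
site_a = {"early.weight": torch.randn(4, 4), "late.weight": torch.randn(4, 4)}
site_b = {"early.weight": torch.randn(4, 4), "late.weight": torch.randn(4, 4)}
out_a, out_b = partial_fedavg([site_a, site_b])
print(torch.allclose(out_a["late.weight"], out_b["late.weight"]),     # True: shared stage aggregated
      torch.allclose(out_a["early.weight"], out_b["early.weight"]))   # False: early stage stays local
```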
Collapse
Affiliation(s)
- Onat Dalmaz
- Department of Electrical and Electronics Engineering, Bilkent University, Ankara 06800, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara 06800, Turkey
| | - Muhammad U Mirza
- Department of Electrical and Electronics Engineering, Bilkent University, Ankara 06800, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara 06800, Turkey
| | - Gokberk Elmas
- Department of Electrical and Electronics Engineering, Bilkent University, Ankara 06800, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara 06800, Turkey
| | - Muzaffer Ozbey
- Department of Electrical and Electronics Engineering, Bilkent University, Ankara 06800, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara 06800, Turkey
| | - Salman U H Dar
- Department of Electrical and Electronics Engineering, Bilkent University, Ankara 06800, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara 06800, Turkey
| | - Emir Ceyani
- Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA 90089, USA
| | - Kader K Oguz
- Department of Radiology, University of California, Davis Medical Center, Sacramento, CA 95817, USA
| | - Salman Avestimehr
- Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA 90089, USA
| | - Tolga Çukur
- Department of Electrical and Electronics Engineering, Bilkent University, Ankara 06800, Turkey; National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara 06800, Turkey; Neuroscience Program, Bilkent University, Ankara 06800, Turkey.
| |
Collapse
|
36
|
Sun Y, Wang Y, Gan K, Wang Y, Chen Y, Ge Y, Yuan J, Xu H. Reliable Delineation of Clinical Target Volumes for Cervical Cancer Radiotherapy on CT/MR Dual-Modality Images. JOURNAL OF IMAGING INFORMATICS IN MEDICINE 2024; 37:575-588. [PMID: 38343225 PMCID: PMC11031539 DOI: 10.1007/s10278-023-00951-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/25/2023] [Revised: 10/10/2023] [Accepted: 10/10/2023] [Indexed: 04/20/2024]
Abstract
Accurate delineation of the clinical target volume (CTV) is a crucial prerequisite for safe and effective radiotherapy. This study addresses the integration of magnetic resonance (MR) images to aid in target delineation on computed tomography (CT) images. However, obtaining MR images directly can be challenging. Therefore, we employ AI-based image generation techniques to "intelligently generate" MR images from CT images to improve CTV delineation based on CT images. To generate high-quality MR images, we propose an attention-guided single-loop image generation model. The model can yield higher-quality images by introducing an attention mechanism in feature extraction and enhancing the loss function. Based on the generated MR images, we propose a CTV segmentation model fusing multi-scale features through image fusion and a hollow space pyramid module to enhance segmentation accuracy. The image generation model used in this study improves the peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) from 14.87 and 0.58 to 16.72 and 0.67, respectively, and improves the feature distribution distance and learning-perception image similarity from 180.86 and 0.28 to 110.98 and 0.22, achieving higher-quality image generation. The proposed segmentation method demonstrates high accuracy: compared with the FCN method, the intersection-over-union ratio and the Dice coefficient improve from 0.8360 and 0.8998 to 0.9043 and 0.9473, respectively, while the Hausdorff distance and mean surface distance decrease from 5.5573 mm and 2.3269 mm to 4.7204 mm and 0.9397 mm, respectively, achieving clinically acceptable segmentation accuracy. Our method might reduce physicians' manual workload and accelerate the diagnosis and treatment process while decreasing inter-observer variability in identifying anatomical structures.
Collapse
Affiliation(s)
- Ying Sun
- School of Electronic Science and Engineering, Nanjing University, Nanjing, China
| | - Yuening Wang
- School of Electronic Science and Engineering, Nanjing University, Nanjing, China
| | - Kexin Gan
- School of Electronic Science and Engineering, Nanjing University, Nanjing, China
| | - Yuxin Wang
- School of Electronic Science and Engineering, Nanjing University, Nanjing, China
| | - Ying Chen
- School of Electronic Science and Engineering, Nanjing University, Nanjing, China
| | - Yun Ge
- School of Electronic Science and Engineering, Nanjing University, Nanjing, China
| | - Jie Yuan
- School of Electronic Science and Engineering, Nanjing University, Nanjing, China.
| | - Hanzi Xu
- Jiangsu Cancer Hospital, Nanjing, China.
| |
Collapse
|
37
|
Jiang M, Wang S, Song Z, Song L, Wang Y, Zhu C, Zheng Q. Cross 2SynNet: cross-device-cross-modal synthesis of routine brain MRI sequences from CT with brain lesion. MAGMA (NEW YORK, N.Y.) 2024; 37:241-256. [PMID: 38315352 DOI: 10.1007/s10334-023-01145-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/08/2023] [Revised: 11/28/2023] [Accepted: 12/27/2023] [Indexed: 02/07/2024]
Abstract
OBJECTIVES CT and MR are often needed together to determine the location and extent of brain lesions and to improve diagnosis. However, patients with acute brain diseases cannot complete an MRI examination within a short time. The aim of this study was to devise a cross-device, cross-modal medical image synthesis (MIS) method, Cross2SynNet, for synthesizing the routine brain MRI sequences T1WI, T2WI, FLAIR, and DWI from CT in the presence of stroke and brain tumors. MATERIALS AND METHODS In this retrospective study, the participants covered four diseases: cerebral ischemic stroke (CIS-cohort), cerebral hemorrhage (CH-cohort), meningioma (M-cohort), and glioma (G-cohort). The MIS model Cross2SynNet was built on the basic architecture of a conditional generative adversarial network (CGAN), in which a fully convolutional Transformer (FCT) module was adopted in the generator to capture short- and long-range dependencies between healthy and pathological tissues, and an edge loss function was used to minimize the difference in gradient magnitude between the synthetic image and the ground truth. Three metrics, mean square error (MSE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM), were used for evaluation. RESULTS A total of 230 participants (mean patient age, 59.77 years ± 13.63 [standard deviation]; 163 men [71%] and 67 women [29%]) were included, comprising the CIS-cohort (95 participants between Dec 2019 and Feb 2022), CH-cohort (69 participants between Jan 2020 and Dec 2021), M-cohort (40 participants between Sep 2018 and Dec 2021), and G-cohort (26 participants between Sep 2019 and Dec 2021). Cross2SynNet achieved average values of MSE = 0.008, PSNR = 21.728, and SSIM = 0.758 when synthesizing MRIs from CT, outperforming CycleGAN, pix2pix, RegGAN, Pix2PixHD, and ResViT. Cross2SynNet could synthesize the brain lesion on pseudo-DWI even when the CT image did not exhibit a clear signal in acute ischemic stroke patients. CONCLUSIONS Cross2SynNet could achieve routine brain MRI synthesis of T1WI, T2WI, FLAIR, and DWI from CT with promising performance in the presence of brain lesions from stroke and brain tumor.
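The edge loss described above penalizes differences in gradient magnitude between the synthetic image and the ground truth. Below is a minimal sketch of such a loss, assuming a generic Sobel-based formulation in PyTorch rather than the authors' exact implementation.

```python
# Hedged sketch of an "edge loss": penalize the difference in gradient magnitude
# between a synthetic image and its ground truth. Generic formulation only.
import torch
import torch.nn.functional as F

def gradient_magnitude(x: torch.Tensor) -> torch.Tensor:
    """Sobel gradient magnitude of a single-channel image batch (N, 1, H, W)."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]],
                      device=x.device).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    gx = F.conv2d(x, kx, padding=1)
    gy = F.conv2d(x, ky, padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)

def edge_loss(fake: torch.Tensor, real: torch.Tensor) -> torch.Tensor:
    """L1 distance between gradient magnitudes of synthetic and real images."""
    return F.l1_loss(gradient_magnitude(fake), gradient_magnitude(real))

# Usage: typically combined with the cGAN adversarial and pixel-wise losses.
fake = torch.rand(2, 1, 128, 128)
real = torch.rand(2, 1, 128, 128)
print(edge_loss(fake, real).item())
```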
Collapse
Affiliation(s)
- Minbo Jiang
- School of Computer and Control Engineering, Yantai University, No 30, Qingquan Road, Laishan District, Yantai, 264005, Shandong, China
| | - Shuai Wang
- Department of Radiology, Binzhou Medical University Hospital, Binzhou, 256603, China
| | - Zhiwei Song
- School of Computer and Control Engineering, Yantai University, No 30, Qingquan Road, Laishan District, Yantai, 264005, Shandong, China
| | - Limei Song
- School of Medical Imaging, Weifang Medical University, Weifang, 261000, China
| | - Yi Wang
- School of Computer and Control Engineering, Yantai University, No 30, Qingquan Road, Laishan District, Yantai, 264005, Shandong, China
| | - Chuanzhen Zhu
- School of Computer and Control Engineering, Yantai University, No 30, Qingquan Road, Laishan District, Yantai, 264005, Shandong, China
| | - Qiang Zheng
- School of Computer and Control Engineering, Yantai University, No 30, Qingquan Road, Laishan District, Yantai, 264005, Shandong, China.
| |
Collapse
|
38
|
Bottani S, Thibeau-Sutre E, Maire A, Ströer S, Dormont D, Colliot O, Burgos N. Contrast-enhanced to non-contrast-enhanced image translation to exploit a clinical data warehouse of T1-weighted brain MRI. BMC Med Imaging 2024; 24:67. [PMID: 38504179 PMCID: PMC10953143 DOI: 10.1186/s12880-024-01242-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/22/2023] [Accepted: 03/07/2024] [Indexed: 03/21/2024] Open
Abstract
BACKGROUND Clinical data warehouses provide access to massive amounts of medical images, but these images are often heterogeneous. They can, for instance, include images acquired both with and without the injection of a gadolinium-based contrast agent. Harmonizing such data sets is thus fundamental to guarantee unbiased results, for example when performing differential diagnosis. Furthermore, classical neuroimaging software tools for feature extraction are typically applied only to images without gadolinium. The objective of this work is to evaluate how image translation can be useful to exploit a highly heterogeneous data set containing both contrast-enhanced and non-contrast-enhanced images from a clinical data warehouse. METHODS We propose and compare different 3D U-Net and conditional GAN models to convert contrast-enhanced T1-weighted (T1ce) into non-contrast-enhanced (T1nce) brain MRI. These models were trained using 230 image pairs and tested on 77 image pairs from the clinical data warehouse of the Greater Paris area. RESULTS Validation using standard image similarity measures demonstrated that the similarity between real and synthetic T1nce images was higher than between real T1nce and T1ce images for all the models compared. The best-performing models were further validated on a segmentation task. We showed that tissue volumes extracted from synthetic T1nce images were closer to those of real T1nce images than volumes extracted from T1ce images. CONCLUSION We showed that deep learning models initially developed with research-quality data could synthesize T1nce from T1ce images of clinical quality and that reliable features could be extracted from the synthetic images, thus demonstrating the ability of such methods to help exploit a data set coming from a clinical data warehouse.
Collapse
Affiliation(s)
- Simona Bottani
- Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, CNRS, Inria, Inserm, AP-HP, Hôpital de la Pitié-Salpêtrière, Paris, 75013, France
| | - Elina Thibeau-Sutre
- Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, CNRS, Inria, Inserm, AP-HP, Hôpital de la Pitié-Salpêtrière, Paris, 75013, France
| | - Aurélien Maire
- Innovation & Données - Département des Services Numériques, AP-HP, Paris, 75013, France
| | - Sebastian Ströer
- Hôpital Pitié Salpêtrière, Department of Neuroradiology, AP-HP, Paris, 75012, France
| | - Didier Dormont
- Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, CNRS, Inria, Inserm, AP-HP, Hôpital de la Pitié-Salpêtrière, DMU DIAMENT, Paris, 75013, France
| | - Olivier Colliot
- Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, CNRS, Inria, Inserm, AP-HP, Hôpital de la Pitié-Salpêtrière, Paris, 75013, France
| | - Ninon Burgos
- Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, CNRS, Inria, Inserm, AP-HP, Hôpital de la Pitié-Salpêtrière, Paris, 75013, France.
| |
Collapse
|
39
|
Hamghalam M, Simpson AL. Medical image synthesis via conditional GANs: Application to segmenting brain tumours. Comput Biol Med 2024; 170:107982. [PMID: 38266466 DOI: 10.1016/j.compbiomed.2024.107982] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2022] [Revised: 12/30/2023] [Accepted: 01/13/2024] [Indexed: 01/26/2024]
Abstract
Accurate brain tumour segmentation is critical for tasks such as surgical planning, diagnosis, and analysis, with magnetic resonance imaging (MRI) being the preferred modality due to its excellent visualisation of brain tissues. However, the wide intensity range of voxel values in MR scans often results in significant overlap between the density distributions of different tumour tissues, leading to reduced contrast and segmentation accuracy. This paper introduces a novel framework based on conditional generative adversarial networks (cGANs) aimed at enhancing the contrast of tumour subregions for both voxel-wise and region-wise segmentation approaches. We present two models: Enhancement and Segmentation GAN (ESGAN), which combines classifier loss with adversarial loss to predict central labels of input patches, and Enhancement GAN (EnhGAN), which generates high-contrast synthetic images with reduced inter-class overlap. These synthetic images are then fused with corresponding modalities to emphasise meaningful tissues while suppressing weaker ones. We also introduce a novel generator that adaptively calibrates voxel values within input patches, leveraging fully convolutional networks. Both models employ a multi-scale Markovian network as a GAN discriminator to capture local patch statistics and estimate the distribution of MR images in complex contexts. Experimental results on publicly available MR brain tumour datasets demonstrate the competitive accuracy of our models compared to current brain tumour segmentation techniques.
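Both models described above use a multi-scale Markovian discriminator, i.e., a patch-level critic that scores local patch statistics rather than whole images. Below is a minimal PatchGAN-style sketch in PyTorch; the layer sizes and channel counts are illustrative assumptions, not taken from the paper.

```python
# Hedged sketch of a Markovian (PatchGAN-style) conditional discriminator.
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """Classifies overlapping patches as real/fake instead of whole images,
    so the adversarial signal reflects local patch statistics."""
    def __init__(self, in_ch: int = 2, base: int = 64):
        super().__init__()
        layers, ch = [], in_ch
        for i, out in enumerate([base, base * 2, base * 4]):
            layers += [nn.Conv2d(ch, out, 4, stride=2, padding=1),
                       nn.InstanceNorm2d(out) if i > 0 else nn.Identity(),
                       nn.LeakyReLU(0.2, inplace=True)]
            ch = out
        layers += [nn.Conv2d(ch, 1, 4, stride=1, padding=1)]  # one logit per patch
        self.net = nn.Sequential(*layers)

    def forward(self, source: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # Conditional setting: the critic sees the source modality and the
        # (real or synthetic) target stacked along the channel axis.
        return self.net(torch.cat([source, target], dim=1))

d = PatchDiscriminator()
logits = d(torch.rand(1, 1, 128, 128), torch.rand(1, 1, 128, 128))
print(logits.shape)  # torch.Size([1, 1, 15, 15]): a grid of patch-level scores
```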
Collapse
Affiliation(s)
- Mohammad Hamghalam
- School of Computing, Queen's University, Kingston, ON, Canada; Department of Electrical Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran.
| | - Amber L Simpson
- School of Computing, Queen's University, Kingston, ON, Canada; Department of Biomedical and Molecular Sciences, Queen's University, Kingston, ON, Canada.
| |
Collapse
|
40
|
Sun H, Yang Z, Zhu J, Li J, Gong J, Chen L, Wang Z, Yin Y, Ren G, Cai J, Zhao L. Pseudo-medical image-guided technology based on 'CBCT-only' mode in esophageal cancer radiotherapy. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 245:108007. [PMID: 38241802 DOI: 10.1016/j.cmpb.2024.108007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/18/2023] [Revised: 11/03/2023] [Accepted: 01/03/2024] [Indexed: 01/21/2024]
Abstract
Purpose To minimize the various errors introduced by image-guided radiotherapy (IGRT) in the application of esophageal cancer treatment, this study proposes a novel technique based on the 'CBCT-only' mode of pseudo-medical image guidance. Methods The framework of this technology consists of two pseudo-medical image synthesis models in the CBCT→CT and the CT→PET direction. The former utilizes a dual-domain parallel deep learning model called AWM-PNet, which incorporates attention waning mechanisms. This model effectively suppresses artifacts in CBCT images in both the sinogram and spatial domains while efficiently capturing important image features and contextual information. The latter leverages tumor location and shape information provided by clinical experts. It introduces a PRAM-GAN model based on a prior region aware mechanism to establish a non-linear mapping relationship between CT and PET image domains. As a result, it enables the generation of pseudo-PET images that meet the clinical requirements for radiotherapy. Results The NRMSE and multi-scale SSIM (MS-SSIM) were utilized to evaluate the test set, and the results were presented as median values with lower quartile and upper quartile ranges. For the AWM-PNet model, the NRMSE and MS-SSIM values were 0.0218 (0.0143, 0.0255) and 0.9325 (0.9141, 0.9410), respectively. The PRAM-GAN model produced NRMSE and MS-SSIM values of 0.0404 (0.0356, 0.0476) and 0.9154 (0.8971, 0.9294), respectively. Statistical analysis revealed significant differences (p < 0.05) between these models and others. The numerical results of dose metrics, including D98 %, Dmean, and D2 %, validated the accuracy of HU values in the pseudo-CT images synthesized by the AWM-PNet. Furthermore, the Dice coefficient results confirmed statistically significant differences (p < 0.05) in GTV delineation between the pseudo-PET images synthesized using the PRAM-GAN model and other compared methods. Conclusion The AWM-PNet and PRAM-GAN models have the capability to generate accurate pseudo-CT and pseudo-PET images, respectively. The pseudo-image-guided technique based on the 'CBCT-only' mode shows promising prospects for application in esophageal cancer radiotherapy.
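The results above are reported as medians with lower and upper quartiles of NRMSE and MS-SSIM. The following is a small sketch of that style of per-case evaluation and summary, using single-scale SSIM from scikit-image as a simple stand-in for MS-SSIM and random placeholder image pairs; it is not the authors' evaluation pipeline.

```python
# Hedged sketch: per-case NRMSE and SSIM summarized as median (Q1, Q3).
import numpy as np
from skimage.metrics import normalized_root_mse, structural_similarity

rng = np.random.default_rng(0)
cases = [(rng.random((64, 64)), rng.random((64, 64))) for _ in range(20)]

nrmse = [normalized_root_mse(gt, pred) for gt, pred in cases]
ssim  = [structural_similarity(gt, pred, data_range=1.0) for gt, pred in cases]

def summarize(values):
    """Report median with lower and upper quartiles, as in the abstract."""
    q1, med, q3 = np.percentile(values, [25, 50, 75])
    return f"{med:.4f} ({q1:.4f}, {q3:.4f})"

print("NRMSE:", summarize(nrmse))
print("SSIM :", summarize(ssim))
```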
Collapse
Affiliation(s)
- Hongfei Sun
- Department of Radiation Oncology, Xijing Hospital, Fourth Military Medical University, Xi'an, China
| | - Zhi Yang
- Department of Radiation Oncology, Xijing Hospital, Fourth Military Medical University, Xi'an, China
| | - Jiarui Zhu
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong, China
| | - Jie Li
- Department of Radiation Oncology, Xijing Hospital, Fourth Military Medical University, Xi'an, China
| | - Jie Gong
- Department of Radiation Oncology, Xijing Hospital, Fourth Military Medical University, Xi'an, China
| | - Liting Chen
- Department of Radiation Oncology, Xijing Hospital, Fourth Military Medical University, Xi'an, China
| | - Zhongfei Wang
- Department of Radiation Oncology, Xijing Hospital, Fourth Military Medical University, Xi'an, China
| | - Yutian Yin
- Department of Radiation Oncology, Xijing Hospital, Fourth Military Medical University, Xi'an, China
| | - Ge Ren
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong, China.
| | - Jing Cai
- Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong, China.
| | - Lina Zhao
- Department of Radiation Oncology, Xijing Hospital, Fourth Military Medical University, Xi'an, China.
| |
Collapse
|
41
|
Kim K, Cho K, Jang R, Kyung S, Lee S, Ham S, Choi E, Hong GS, Kim N. Updated Primer on Generative Artificial Intelligence and Large Language Models in Medical Imaging for Medical Professionals. Korean J Radiol 2024; 25:224-242. [PMID: 38413108 PMCID: PMC10912493 DOI: 10.3348/kjr.2023.0818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2023] [Revised: 11/27/2023] [Accepted: 12/28/2023] [Indexed: 02/29/2024] Open
Abstract
The emergence of Chat Generative Pre-trained Transformer (ChatGPT), a chatbot developed by OpenAI, has garnered interest in the application of generative artificial intelligence (AI) models in the medical field. This review summarizes different generative AI models and their potential applications in the field of medicine and explores the evolving landscape of Generative Adversarial Networks and diffusion models since the introduction of generative AI models. These models have made valuable contributions to the field of radiology. Furthermore, this review also explores the significance of synthetic data in addressing privacy concerns and augmenting data diversity and quality within the medical domain, in addition to emphasizing the role of inversion in the investigation of generative models and outlining an approach to replicate this process. We provide an overview of Large Language Models, such as GPT and bidirectional encoder representations from transformers (BERT), focusing on prominent representatives, and discuss recent initiatives involving language-vision models in radiology, including the innovative large language and vision assistant for biomedicine (LLaVa-Med), to illustrate their practical application. This comprehensive review offers insights into the wide-ranging applications of generative AI models in clinical research and emphasizes their transformative potential.
Collapse
Affiliation(s)
- Kiduk Kim
- Department of Convergence Medicine, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Republic of Korea
| | - Kyungjin Cho
- Department of Biomedical Engineering, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
| | | | - Sunggu Kyung
- Department of Biomedical Engineering, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
| | - Soyoung Lee
- Department of Biomedical Engineering, Asan Medical Institute of Convergence Science and Technology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
| | - Sungwon Ham
- Healthcare Readiness Institute for Unified Korea, Korea University Ansan Hospital, Korea University College of Medicine, Ansan, Republic of Korea
| | - Edward Choi
- Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
| | - Gil-Sun Hong
- Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Republic of Korea.
| | - Namkug Kim
- Department of Convergence Medicine, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Republic of Korea
- Department of Radiology and Research Institute of Radiology, University of Ulsan College of Medicine, Asan Medical Center, Seoul, Republic of Korea.
| |
Collapse
|
42
|
Dai F, Li Y, Zhu Y, Li B, Shi Q, Chen Y, Ta D. B-mode ultrasound to elastography synthesis using multiscale learning. ULTRASONICS 2024; 138:107268. [PMID: 38402836 DOI: 10.1016/j.ultras.2024.107268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/04/2023] [Revised: 02/03/2024] [Accepted: 02/09/2024] [Indexed: 02/27/2024]
Abstract
Elastography is a promising diagnostic tool that measures the hardness of tissues, and it has been used in clinics to assess lesion progression, for example in benign and malignant tumors. However, due to the high cost of examination and the limited availability of elastic ultrasound devices, elastography is not widely used in primary medical facilities in rural areas. To address this issue, a deep learning approach called the multiscale elastic image synthesis network (MEIS-Net) was proposed, which utilizes multiscale learning to synthesize elastic images from ultrasound data, rather than relying on traditional ultrasound elastography based on elastic deformation. The method integrates multi-scale features of the prostate in an innovative way and enhances the elastic synthesis effect through a fusion module. The module obtains B-mode ultrasound and elastography feature maps, which are used to generate local and global elastic ultrasound images through their correspondence. Finally, the two-channel images are synthesized into the output elastic images. To evaluate the approach, quantitative assessments and diagnostic tests were conducted, comparing the results of MEIS-Net with several deep learning-based methods. The experiments showed that MEIS-Net was effective in synthesizing elastic images from B-mode ultrasound data acquired from two different devices, with a structural similarity index of 0.74 ± 0.04. This outperformed other methods such as Pix2Pix (0.69 ± 0.09), CycleGAN (0.11 ± 0.27), and StarGANv2 (0.02 ± 0.01). Furthermore, the diagnostic tests demonstrated that the classification performance on the synthetic elastic images was comparable to that on real elastic images, with only a 3% decrease in the area under the curve (AUC), indicating the clinical effectiveness of the proposed method.
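The diagnostic test above compares classification performance on real versus synthetic elastic images via the area under the ROC curve. Here is a hedged sketch of that kind of comparison with simulated scores and labels; none of the numbers or variable names come from the study.

```python
# Hedged sketch: compare AUC of the same diagnostic task on real vs. synthetic images.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
labels = rng.integers(0, 2, size=200)                  # benign/malignant ground truth
scores_real = labels + rng.normal(0, 0.8, size=200)    # classifier scores on real elastograms
scores_synth = labels + rng.normal(0, 0.9, size=200)   # classifier scores on synthetic elastograms

auc_real = roc_auc_score(labels, scores_real)
auc_synth = roc_auc_score(labels, scores_synth)
print(f"AUC real: {auc_real:.3f}, AUC synthetic: {auc_synth:.3f}, "
      f"difference: {auc_real - auc_synth:.3f}")
```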
Collapse
Affiliation(s)
- Fei Dai
- Center for Biomedical Engineering, School of Information Science and Technology, Fudan University, Shanghai 200433, China
| | - Yifang Li
- Academy for Engineering and Technology, Fudan University, Shanghai 200433, China
| | - Yunkai Zhu
- Department of Ultrasound, Xinhua Hospital Affiliated to Shanghai Jiaotong University School of Medicine, Shanghai 200092, China
| | - Boyi Li
- Academy for Engineering and Technology, Fudan University, Shanghai 200433, China
| | - Qinzhen Shi
- Center for Biomedical Engineering, School of Information Science and Technology, Fudan University, Shanghai 200433, China
| | - Yaqing Chen
- Department of Ultrasound, Xinhua Hospital Affiliated to Shanghai Jiaotong University School of Medicine, Shanghai 200092, China.
| | - Dean Ta
- Center for Biomedical Engineering, School of Information Science and Technology, Fudan University, Shanghai 200433, China.
| |
Collapse
|
43
|
Curcuru AN, Yang D, An H, Cuculich PS, Robinson CG, Gach HM. Technical note: Minimizing CIED artifacts on a 0.35 T MRI-Linac using deep learning. J Appl Clin Med Phys 2024; 25:e14304. [PMID: 38368615 DOI: 10.1002/acm2.14304] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2023] [Revised: 01/11/2024] [Accepted: 02/03/2024] [Indexed: 02/20/2024] Open
Abstract
BACKGROUND Artifacts from implantable cardioverter defibrillators (ICDs) are a challenge to magnetic resonance imaging (MRI)-guided radiotherapy (MRgRT). PURPOSE This study tested an unsupervised generative adversarial network to mitigate ICD artifacts in balanced steady-state free precession (bSSFP) cine MRIs and improve image quality and tracking performance for MRgRT. METHODS Fourteen healthy volunteers (Group A) were scanned on a 0.35 T MRI-Linac with and without an MR-conditional ICD taped to their left pectoral region to simulate an implanted ICD. bSSFP MRI data from 12 of the volunteers were used to train a CycleGAN model to reduce ICD artifacts. The data from the remaining two volunteers were used for testing. In addition, the dataset was reorganized three times using a leave-one-out scheme. Tracking metrics [Dice similarity coefficient (DSC), target registration error (TRE), and 95th-percentile Hausdorff distance (95% HD)] were evaluated for whole-heart contours. Image quality metrics [normalized root mean square error (nRMSE), peak signal-to-noise ratio (PSNR), and multiscale structural similarity (MS-SSIM) scores] were evaluated. The technique was also tested qualitatively on three additional ICD datasets (Group B), including a patient with an implanted ICD. RESULTS For the whole-heart contour with CycleGAN reconstruction: 1) mean DSC rose from 0.910 to 0.935; 2) mean TRE dropped from 4.488 to 2.877 mm; and 3) mean 95% HD dropped from 10.236 to 7.700 mm. For the whole-body slice with CycleGAN reconstruction: 1) mean nRMSE dropped from 0.644 to 0.420; 2) mean MS-SSIM rose from 0.779 to 0.819; and 3) mean PSNR rose from 18.744 to 22.368. The three Group B datasets evaluated qualitatively displayed a reduction in ICD artifacts in the heart. CONCLUSION CycleGAN-generated reconstructions significantly improved both tracking and image quality metrics when used to mitigate artifacts from ICDs.
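The artifact-mitigation model above is a CycleGAN, which trains without paired artifact-corrupted and artifact-free images. Below is a minimal sketch of the generator objective such a model typically optimizes (least-squares adversarial terms plus cycle consistency); the identity modules stand in for the actual networks, and the weighting is an illustrative assumption, not the authors' configuration.

```python
# Hedged sketch of a CycleGAN generator objective: adversarial + cycle-consistency terms.
import torch
import torch.nn as nn

def cycle_gan_generator_loss(G_ab, G_ba, D_b, D_a, real_a, real_b, lam=10.0):
    mse, l1 = nn.MSELoss(), nn.L1Loss()
    fake_b, fake_a = G_ab(real_a), G_ba(real_b)
    # Least-squares adversarial terms: fool each domain discriminator.
    adv = mse(D_b(fake_b), torch.ones_like(D_b(fake_b))) + \
          mse(D_a(fake_a), torch.ones_like(D_a(fake_a)))
    # Cycle consistency: a -> b -> a (and b -> a -> b) should reconstruct the input,
    # which is what allows training without paired artifact/artifact-free images.
    cyc = l1(G_ba(fake_b), real_a) + l1(G_ab(fake_a), real_b)
    return adv + lam * cyc

# Minimal runnable check with identity "networks" standing in for the real models.
ident = nn.Identity()
a = torch.rand(1, 1, 64, 64)
b = torch.rand(1, 1, 64, 64)
print(cycle_gan_generator_loss(ident, ident, ident, ident, a, b).item())
```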
Collapse
Affiliation(s)
- Austen N Curcuru
- Department of Radiation Oncology, Washington University in St. Louis, St. Louis, Missouri, USA
| | - Deshan Yang
- Department of Radiation Oncology, Duke University, Durham, North Carolina, USA
| | - Hongyu An
- Departments of Radiology, Biomedical Engineering and Neurology, Washington University in St. Louis, St. Louis, Missouri, USA
| | - Phillip S Cuculich
- Departments of Cardiovascular Medicine and Radiation Oncology, Washington University in St. Louis, St. Louis, Missouri, USA
| | - Clifford G Robinson
- Department of Radiation Oncology, Washington University in St. Louis, St. Louis, Missouri, USA
| | - H Michael Gach
- Departments of Radiation Oncology, Radiology and Biomedical Engineering, Washington University in St. Louis, St. Louis, Missouri, USA
| |
Collapse
|
44
|
Wei K, Kong W, Liu L, Wang J, Li B, Zhao B, Li Z, Zhu J, Yu G. CT synthesis from MR images using frequency attention conditional generative adversarial network. Comput Biol Med 2024; 170:107983. [PMID: 38286104 DOI: 10.1016/j.compbiomed.2024.107983] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/24/2023] [Revised: 12/24/2023] [Accepted: 01/13/2024] [Indexed: 01/31/2024]
Abstract
Magnetic resonance (MR) image-guided radiotherapy is widely used in the treatment planning of malignant tumors, and MR-only radiotherapy, a representative of this technique, requires synthetic computed tomography (sCT) images for effective radiotherapy planning. Convolutional neural networks (CNNs) have shown remarkable performance in generating sCT images. However, CNN-based models tend to synthesize more low-frequency components, and the pixel-wise loss function usually used to optimize the model can result in blurred images. To address these problems, a frequency attention conditional generative adversarial network (FACGAN) is proposed in this paper. Specifically, a frequency cycle generative model (FCGM) is designed to enhance the inter-mapping between MR and CT and to extract richer tissue structure information. Additionally, a residual frequency channel attention (RFCA) module is proposed and incorporated into the generator to enhance its ability to perceive high-frequency image features. Finally, a high-frequency loss (HFL) and a cycle-consistency high-frequency loss (CHFL) are added to the objective function to optimize model training. The effectiveness of the proposed model is validated on pelvic and brain datasets and compared with state-of-the-art deep learning models. The results show that FACGAN produces higher-quality sCT images while retaining clearer and richer high-frequency texture information.
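The high-frequency loss (HFL) mentioned above compares only the high-frequency content of synthetic and reference CT images. Below is a hedged sketch of one common way to express such a loss, using an FFT-based high-pass filter in PyTorch; the cutoff and the L1 form are assumptions, not the paper's exact definition.

```python
# Hedged sketch of a high-frequency loss: L1 distance between high-pass-filtered images.
import torch
import torch.nn.functional as F

def high_pass(x: torch.Tensor, cutoff: float = 0.1) -> torch.Tensor:
    """Zero out low spatial frequencies of an image batch (N, 1, H, W)."""
    n, _, h, w = x.shape
    spec = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    yy, xx = torch.meshgrid(torch.linspace(-0.5, 0.5, h),
                            torch.linspace(-0.5, 0.5, w), indexing="ij")
    mask = ((yy ** 2 + xx ** 2).sqrt() > cutoff).float().to(spec.device)
    return torch.fft.ifft2(torch.fft.ifftshift(spec * mask, dim=(-2, -1))).real

def high_frequency_loss(fake_ct: torch.Tensor, real_ct: torch.Tensor) -> torch.Tensor:
    """Compare only the high-frequency Fourier content of synthetic and real CT."""
    return F.l1_loss(high_pass(fake_ct), high_pass(real_ct))

print(high_frequency_loss(torch.rand(2, 1, 64, 64), torch.rand(2, 1, 64, 64)).item())
```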
Collapse
Affiliation(s)
- Kexin Wei
- Shandong Key Laboratory of Medical Physics and Image Processing, Shandong Institute of Industrial Technology for Health Sciences and Precision Medicine, School of Physics and Electronics, Shandong Normal University, Jinan, China
| | - Weipeng Kong
- Shandong Key Laboratory of Medical Physics and Image Processing, Shandong Institute of Industrial Technology for Health Sciences and Precision Medicine, School of Physics and Electronics, Shandong Normal University, Jinan, China
| | - Liheng Liu
- Department of Radiology, Zhongshan Hospital, Fudan University, Shanghai, China
| | - Jian Wang
- Department of Radiology, Central Hospital Affiliated to Shandong First Medical University, Jinan, China
| | - Baosheng Li
- Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, No.440, Jiyan Road, Jinan, 250117, Shandong Province, China
| | - Bo Zhao
- Shandong Key Laboratory of Medical Physics and Image Processing, Shandong Institute of Industrial Technology for Health Sciences and Precision Medicine, School of Physics and Electronics, Shandong Normal University, Jinan, China
| | - Zhenjiang Li
- Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, No.440, Jiyan Road, Jinan, 250117, Shandong Province, China
| | - Jian Zhu
- Shandong Cancer Hospital and Institute, Shandong First Medical University and Shandong Academy of Medical Sciences, No.440, Jiyan Road, Jinan, 250117, Shandong Province, China.
| | - Gang Yu
- Shandong Key Laboratory of Medical Physics and Image Processing, Shandong Institute of Industrial Technology for Health Sciences and Precision Medicine, School of Physics and Electronics, Shandong Normal University, Jinan, China.
| |
Collapse
|
45
|
Bao L, Zhang H, Liao Z. A spatially adaptive regularization based three-dimensional reconstruction network for quantitative susceptibility mapping. Phys Med Biol 2024; 69:045030. [PMID: 38286013 DOI: 10.1088/1361-6560/ad237f] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2023] [Accepted: 01/29/2024] [Indexed: 01/31/2024]
Abstract
Objective. Quantitative susceptibility mapping (QSM) is a new imaging technique for non-invasive characterization of the composition and microstructure of in vivo tissues, and it can be reconstructed from local field measurements by solving an ill-posed inverse problem. Even for deep learning networks, it is not an easy task to establish an accurate quantitative mapping between two physical quantities of different units, i.e., field shift in Hz and susceptibility value in ppm for QSM. Approach. In this paper, we propose a spatially adaptive regularization based three-dimensional reconstruction network, SAQSM. A spatially adaptive module is specially designed, and a set of these modules at different resolutions is inserted into the network decoder, acting as a cross-modality regularization constraint. In this way, the exact information of both the field and magnitude data is exploited to adjust the scale and shift of feature maps, so that any information loss or deviation occurring in previous layers can be effectively corrected. The network encoding has a dynamic perceptual initialization, which enables the network to overcome receptive field intervals and also strengthens its ability to detect features of various sizes. Main results. Experimental results on brain data of healthy volunteers, clinical hemorrhage, and a simulated phantom with calcification demonstrate that SAQSM achieves more accurate reconstruction with fewer susceptibility artifacts, while performing well in terms of stability and generalization even for severe lesion areas. Significance. The proposed framework may provide a valuable paradigm for quantitative mapping or multimodal reconstruction.
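The spatially adaptive module described above uses the magnitude (and field) data to predict per-location scale and shift corrections for decoder feature maps. Below is a minimal PyTorch sketch of that general idea, with illustrative channel sizes and a single-channel guidance image; it is a sketch of the mechanism, not the SAQSM implementation.

```python
# Hedged sketch: spatially adaptive (scale-and-shift) modulation of feature maps
# driven by a guidance image such as the magnitude data.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatiallyAdaptiveModulation(nn.Module):
    def __init__(self, feat_ch: int, guide_ch: int = 1, hidden: int = 32):
        super().__init__()
        self.shared = nn.Sequential(nn.Conv2d(guide_ch, hidden, 3, padding=1),
                                    nn.ReLU(inplace=True))
        self.to_scale = nn.Conv2d(hidden, feat_ch, 3, padding=1)
        self.to_shift = nn.Conv2d(hidden, feat_ch, 3, padding=1)

    def forward(self, feat: torch.Tensor, guide: torch.Tensor) -> torch.Tensor:
        # Resample the guidance (e.g. magnitude image) to the feature resolution,
        # then apply a per-pixel, per-channel affine correction.
        guide = F.interpolate(guide, size=feat.shape[-2:], mode="bilinear",
                              align_corners=False)
        h = self.shared(guide)
        return feat * (1 + self.to_scale(h)) + self.to_shift(h)

block = SpatiallyAdaptiveModulation(feat_ch=16)
feat = torch.rand(1, 16, 32, 32)           # decoder feature map
magnitude = torch.rand(1, 1, 128, 128)     # full-resolution guidance image
print(block(feat, magnitude).shape)        # torch.Size([1, 16, 32, 32])
```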
Collapse
Affiliation(s)
- Lijun Bao
- Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Xiamen University, Xiamen 361005, People's Republic of China
| | - Hongyuan Zhang
- Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Xiamen University, Xiamen 361005, People's Republic of China
- Zhangzhou Institute of Science and Technology, Zhangzhou City, Fujian Province, People's Republic of China
| | - Zeyu Liao
- Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Xiamen University, Xiamen 361005, People's Republic of China
| |
Collapse
|
46
|
Boubnovski Martell M, Linton-Reid K, Hindocha S, Chen M, Moreno P, Álvarez-Benito M, Salvatierra Á, Lee R, Posma JM, Calzado MA, Aboagye EO. Deep representation learning of tissue metabolome and computed tomography annotates NSCLC classification and prognosis. NPJ Precis Oncol 2024; 8:28. [PMID: 38310164 PMCID: PMC10838282 DOI: 10.1038/s41698-024-00502-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2023] [Accepted: 01/04/2024] [Indexed: 02/05/2024] Open
Abstract
The rich chemical information from tissue metabolomics provides a powerful means to elaborate tissue physiology or tumor characteristics at the cellular and tumor microenvironment levels. However, the process of obtaining such information requires invasive biopsies, is costly, and can delay clinical patient management. Conversely, computed tomography (CT) is a clinical standard of care but does not intuitively harbor histological or prognostic information. Furthermore, the ability to embed metabolome information into CT and subsequently use the learned representation for classification or prognosis has yet to be described. This study develops a deep learning-based framework, tissue-metabolomic-radiomic-CT (TMR-CT), by combining 48 paired CT images and tumor/normal tissue metabolite intensities to generate ten image embeddings that infer a metabolite-derived representation from CT alone. In clinical NSCLC settings, we ascertain whether TMR-CT results in an enhanced feature-generation model for histology classification and prognosis tasks in an unseen international CT dataset of 742 patients. TMR-CT non-invasively determines histological classes (adenocarcinoma vs. squamous cell carcinoma) with an F1-score of 0.78 and further predicts patients' prognosis with a c-index of 0.72, surpassing the performance of radiomics models and deep learning on single-modality CT feature extraction. Additionally, our work shows the potential to generate informative biology-inspired CT-led features to explore connections between hard-to-obtain tissue metabolic profiles and routine lesion-derived image data.
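The two headline numbers above are an F1-score for histology classification and a concordance index (c-index) for prognosis. The sketch below illustrates how these quantities are commonly computed; it uses a plain pairwise c-index for uncensored data and simulated placeholder labels and risk scores, not the study's data or code.

```python
# Hedged sketch: F1-score for a binary histology task and a simple c-index for prognosis.
import numpy as np
from sklearn.metrics import f1_score

def concordance_index(times, risks, events):
    """Fraction of comparable patient pairs whose risk ordering matches survival order."""
    num, den = 0.0, 0.0
    for i in range(len(times)):
        for j in range(len(times)):
            if times[i] < times[j] and events[i]:   # patient i had the earlier event
                den += 1
                if risks[i] > risks[j]:
                    num += 1.0
                elif risks[i] == risks[j]:
                    num += 0.5
    return num / den

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 50)             # adenocarcinoma vs squamous (placeholder)
y_pred = rng.integers(0, 2, 50)
times = rng.exponential(24, 50)             # survival times in months (placeholder)
risks = -times + rng.normal(0, 8, 50)       # higher risk should mean shorter survival
events = np.ones(50, dtype=bool)            # uncensored toy data
print("F1:", f1_score(y_true, y_pred), "c-index:", concordance_index(times, risks, events))
```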
Collapse
Affiliation(s)
| | | | - Sumeet Hindocha
- Early Diagnosis and Detection Centre, National Institute for Health and Care Research Biomedical Research Centre at the Royal Marsden and Institute of Cancer Research, London, SW3 6JJ, UK
| | - Mitchell Chen
- Imperial College London Hammersmith Campus, London, SW7 2AZ, UK
| | - Paula Moreno
- Instituto Maimónides de Investigación Biomédica de Córdoba (IMIBIC), Córdoba, 14004, Spain
- Departamento de Cirugía Toráxica y Trasplante de Pulmón, Hospital Universitario Reina Sofía, Córdoba, 14014, Spain
| | - Marina Álvarez-Benito
- Instituto Maimónides de Investigación Biomédica de Córdoba (IMIBIC), Córdoba, 14004, Spain
- Unidad de Radiodiagnóstico y Cáncer de Mama, Hospital Universitario Reina Sofía, Córdoba, 14004, Spain
| | - Ángel Salvatierra
- Instituto Maimónides de Investigación Biomédica de Córdoba (IMIBIC), Córdoba, 14004, Spain
- Unidad de Radiodiagnóstico y Cáncer de Mama, Hospital Universitario Reina Sofía, Córdoba, 14004, Spain
| | - Richard Lee
- Early Diagnosis and Detection Centre, National Institute for Health and Care Research Biomedical Research Centre at the Royal Marsden and Institute of Cancer Research, London, SW3 6JJ, UK
- National Heart and Lung Institute, Imperial College London, Guy Scadding Building, Dovehouse Street, London, SW3 6LY, UK
| | - Joram M Posma
- Imperial College London Hammersmith Campus, London, SW7 2AZ, UK
| | - Marco A Calzado
- Instituto Maimónides de Investigación Biomédica de Córdoba (IMIBIC), Córdoba, 14004, Spain.
- Departamento de Biología Celular, Fisiología e Inmunología, Universidad de Córdoba, Córdoba, 14014, Spain.
| | - Eric O Aboagye
- Imperial College London Hammersmith Campus, London, SW7 2AZ, UK.
| |
Collapse
|
47
|
Zhong L, Chen Z, Shu H, Zheng K, Li Y, Chen W, Wu Y, Ma J, Feng Q, Yang W. Multi-Scale Tokens-Aware Transformer Network for Multi-Region and Multi-Sequence MR-to-CT Synthesis in a Single Model. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:794-806. [PMID: 37782590 DOI: 10.1109/tmi.2023.3321064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/04/2023]
Abstract
The superiority of magnetic resonance (MR)-only radiotherapy treatment planning (RTP) has been well demonstrated, benefiting from the synthesis of computed tomography (CT) images, which supplements electron density and eliminates the errors of multi-modal image registration. An increasing number of methods have been proposed for MR-to-CT synthesis. However, synthesizing CT images of different anatomical regions from MR images with different sequences using a single model is challenging due to the large differences between these regions and the limitations of convolutional neural networks in capturing global context information. In this paper, we propose a multi-scale tokens-aware Transformer network (MTT-Net) for multi-region and multi-sequence MR-to-CT synthesis in a single model. Specifically, we develop a multi-scale image tokens Transformer to capture multi-scale global spatial information between different anatomical structures in different regions. In addition, to address the limited attention areas of tokens in the Transformer, we introduce a multi-shape window self-attention into the Transformer to enlarge the receptive fields for learning multi-directional spatial representations. Moreover, we adopt a domain classifier in the generator to introduce domain knowledge for distinguishing MR images of different regions and sequences. The proposed MTT-Net is evaluated on a multi-center dataset and an unseen region, achieving remarkable performance with an MAE of 69.33 ± 10.39 HU, SSIM of 0.778 ± 0.028, and PSNR of 29.04 ± 1.32 dB in the head & neck region, and an MAE of 62.80 ± 7.65 HU, SSIM of 0.617 ± 0.058, and PSNR of 25.94 ± 1.02 dB in the abdomen region. The proposed MTT-Net outperforms state-of-the-art methods in both accuracy and visual quality.
Collapse
|
48
|
Dayarathna S, Islam KT, Uribe S, Yang G, Hayat M, Chen Z. Deep learning based synthesis of MRI, CT and PET: Review and analysis. Med Image Anal 2024; 92:103046. [PMID: 38052145 DOI: 10.1016/j.media.2023.103046] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/05/2023] [Revised: 11/14/2023] [Accepted: 11/29/2023] [Indexed: 12/07/2023]
Abstract
Medical image synthesis represents a critical area of research in clinical decision-making, aiming to overcome the challenges associated with acquiring multiple image modalities for an accurate clinical workflow. This approach proves beneficial in estimating an image of a desired modality from a given source modality among the most common medical imaging contrasts, such as Computed Tomography (CT), Magnetic Resonance Imaging (MRI), and Positron Emission Tomography (PET). However, translating between two image modalities presents difficulties due to the complex and non-linear domain mappings. Deep learning-based generative modelling has exhibited superior performance in synthetic image contrast applications compared to conventional image synthesis methods. This survey comprehensively reviews deep learning-based medical imaging translation from 2018 to 2023 on pseudo-CT, synthetic MR, and synthetic PET. We provide an overview of synthetic contrasts in medical imaging and the most frequently employed deep learning networks for medical image synthesis. Additionally, we conduct a detailed analysis of each synthesis method, focusing on their diverse model designs based on input domains and network architectures. We also analyse novel network architectures, ranging from conventional CNNs to the recent Transformer and Diffusion models. This analysis includes comparing loss functions, available datasets and anatomical regions, and image quality assessments and performance in other downstream tasks. Finally, we discuss the challenges and identify solutions within the literature, suggesting possible future directions. We hope that the insights offered in this survey paper will serve as a valuable roadmap for researchers in the field of medical image synthesis.
Collapse
Affiliation(s)
- Sanuwani Dayarathna
- Department of Data Science and AI, Faculty of Information Technology, Monash University, Clayton VIC 3800, Australia.
| | | | - Sergio Uribe
- Department of Medical Imaging and Radiation Sciences, Faculty of Medicine, Monash University, Clayton VIC 3800, Australia
| | - Guang Yang
- Bioengineering Department and Imperial-X, Imperial College London, W12 7SL, United Kingdom
| | - Munawar Hayat
- Department of Data Science and AI, Faculty of Information Technology, Monash University, Clayton VIC 3800, Australia
| | - Zhaolin Chen
- Department of Data Science and AI, Faculty of Information Technology, Monash University, Clayton VIC 3800, Australia; Monash Biomedical Imaging, Clayton VIC 3800, Australia
| |
Collapse
|
49
|
Dubey G, Srivastava S, Jayswal AK, Saraswat M, Singh P, Memoria M. Fetal Ultrasound Segmentation and Measurements Using Appearance and Shape Prior Based Density Regression with Deep CNN and Robust Ellipse Fitting. JOURNAL OF IMAGING INFORMATICS IN MEDICINE 2024; 37:247-267. [PMID: 38343234 PMCID: PMC10976955 DOI: 10.1007/s10278-023-00908-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/03/2023] [Revised: 09/05/2023] [Accepted: 09/06/2023] [Indexed: 03/02/2024]
Abstract
Accurately segmenting the structure of the fetal head (FH) and performing biometry measurements, including head circumference (HC) estimation, is a vital requirement for assessing abnormal fetal growth during pregnancy, a task normally performed by experienced radiologists on ultrasound (US) images. However, accurate segmentation and measurement are challenging due to image artifacts, incomplete ellipse fitting, and fluctuations in FH dimensions over different trimesters. The task is also highly time-consuming, and the absence of specialized features leads to low segmentation accuracy. To address these challenges, we propose an automatic density regression approach that incorporates appearance and shape priors into a deep learning-based network model (DR-ASPnet) with robust ellipse fitting on fetal US images. Initially, we employed multiple pre-processing steps to remove unwanted distortions and variable fluctuations and to obtain a clear view of significant features in the US images. Augmentation operations were then applied to increase the diversity of the dataset. Next, we proposed the hierarchical density regression deep convolutional neural network (HDR-DCNN) model, which involves three network models to determine the complex location of the FH for accurate segmentation during training and testing. We then applied post-processing operations, using contrast enhancement filtering with a morphological operation model, to smooth the region and remove unnecessary artifacts from the segmentation results. After post-processing, we applied the smoothed segmentation result to the robust ellipse fitting-based least squares (REFLS) method for HC estimation. The DR-ASPnet model achieves a Dice similarity coefficient (DSC) of 98.86% for segmentation accuracy and an absolute distance (AD) of 1.67 mm for measurement accuracy, compared to other state-of-the-art methods. Finally, we achieved a correlation coefficient (CC) of 0.99 between measured and predicted HC values on the HC18 dataset.
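The final HC estimate above comes from a least-squares ellipse fit to the segmented head boundary. Below is a hedged sketch of that step, using OpenCV's generic least-squares ellipse fitter (not the authors' REFLS implementation) and Ramanujan's perimeter approximation, with a synthetic boundary and an assumed pixel spacing.

```python
# Hedged sketch: least-squares ellipse fit to a fetal-head boundary and HC estimation.
import numpy as np
import cv2

# Placeholder "segmentation boundary": noisy points sampled from a known ellipse (pixels).
rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 200)
a_true, b_true = 90.0, 60.0
pts = np.stack([160 + a_true * np.cos(t), 120 + b_true * np.sin(t)], axis=1)
pts += rng.normal(0, 1.0, pts.shape)

# Least-squares ellipse fit; returns center, full axis lengths, and rotation angle.
(cx, cy), (d1, d2), angle = cv2.fitEllipse(pts.astype(np.float32).reshape(-1, 1, 2))
a, b = d1 / 2.0, d2 / 2.0                  # semi-axes in pixels

pixel_mm = 0.2                             # assumed pixel spacing (illustrative)
a_mm, b_mm = a * pixel_mm, b * pixel_mm
# Ramanujan's approximation of the ellipse perimeter gives the head circumference.
hc = np.pi * (3 * (a_mm + b_mm) - np.sqrt((3 * a_mm + b_mm) * (a_mm + 3 * b_mm)))
print(f"fitted semi-axes (mm): {a_mm:.1f}, {b_mm:.1f}; estimated HC: {hc:.1f} mm")
```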
Collapse
Affiliation(s)
- Gaurav Dubey
- Department of Computer Science, KIET Group of Institutions, Delhi-NCR, Ghaziabad, U.P, India
| | | | | | - Mala Saraswat
- Department of Computer Science, Bennett University, Greater Noida, India
| | - Pooja Singh
- Shiv Nadar University, Greater Noida, Uttar Pradesh, India
| | - Minakshi Memoria
- CSE Department, UIT, Uttaranchal University, Dehradun, Uttarakhand, India
| |
Collapse
|
50
|
Kim S, Yuan L, Kim S, Suh TS. Generation of tissues outside the field of view (FOV) of radiation therapy simulation imaging based on machine learning and patient body outline (PBO). Radiat Oncol 2024; 19:15. [PMID: 38273278 PMCID: PMC10811833 DOI: 10.1186/s13014-023-02384-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2023] [Accepted: 11/28/2023] [Indexed: 01/27/2024] Open
Abstract
BACKGROUND It is not unusual for some parts of the patient's tissue to be excluded from the field of view of CT simulation images. A typical mitigation is to avoid beams entering the missing body parts, at the cost of sub-optimal planning. METHODS This study addresses the problem by developing three methods: (1) a deep learning (DL) mechanism for missing tissue generation, (2) use of the patient body outline (PBO) based on surface imaging, and (3) a hybrid method combining DL and PBO. The DL model was built upon Globally and Locally Consistent Image Completion, which learns features through convolutional neural network-based inpainting within a generative adversarial network framework. The database comprised 10,005 CT training slices from 322 lung cancer patients and 166 CT evaluation test slices from 15 patients. CT images were from the publicly available database of the Cancer Imaging Archive. Since existing data were used, PBOs were acquired from the CT images. For evaluation, the structural similarity index metric (SSIM), root mean square error (RMSE), and peak signal-to-noise ratio (PSNR) were computed. For dosimetric validation, dynamic conformal arc plans were made with the ground truth images and the images generated by the proposed method. Gamma analysis was conducted at the relatively strict criteria of 1%/1 mm (dose difference/distance to agreement) and 2%/2 mm under three dose thresholds of 1%, 10%, and 50% of the maximum dose in the plans made on the ground truth image sets. RESULTS The average SSIM in the generated region only was 0.06 at epoch 100 but reached 0.86 at epoch 1500. Accordingly, the average SSIM over the whole image also improved from 0.86 to 0.97. At epoch 1500, the average values of RMSE and PSNR over the whole image were 7.4 and 30.9, respectively. Gamma analysis showed excellent agreement for the hybrid method (pass rates equal to or higher than 96.6% on average for all scenarios). CONCLUSIONS It was demonstrated for the first time that missing tissues in simulation imaging could be generated with high similarity and that the dosimetric limitation could be overcome. The benefit of this study can be significantly enlarged when MR-only simulation is considered.
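The results above distinguish SSIM over the generated region from SSIM over the whole image. The following is a minimal sketch of that region-wise evaluation using scikit-image's full SSIM map and a mask of the out-of-FOV area; the images and mask are synthetic placeholders, not the study's data.

```python
# Hedged sketch: SSIM over the whole slice vs. over the generated (missing) region only.
import numpy as np
from skimage.metrics import structural_similarity

rng = np.random.default_rng(0)
ground_truth = rng.random((256, 256)).astype(np.float32)
completed = ground_truth + rng.normal(0, 0.05, ground_truth.shape).astype(np.float32)

# Mask of the region that fell outside the simulation FOV and had to be generated.
missing = np.zeros_like(ground_truth, dtype=bool)
missing[:, 200:] = True

ssim_whole, ssim_map = structural_similarity(ground_truth, completed,
                                             data_range=1.0, full=True)
ssim_generated = ssim_map[missing].mean()
print(f"SSIM whole image: {ssim_whole:.3f}, generated region only: {ssim_generated:.3f}")
```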
Collapse
Affiliation(s)
- Sunmi Kim
- Department of Biomedical Engineering and Research Institute of Biomedical Engineering, College of Medicine, The Catholic University of Korea, 222 Banpo-daero, Seocho-gu, Seoul, 06591, Republic of Korea
- Department of Radiation Oncology, Yonsei Cancer Center, Seoul, 03722, Republic of Korea
| | - Lulin Yuan
- Department of Radiation Oncology, School of Medicine, Virginia Commonwealth University, Richmond, VA, 23284, USA
| | - Siyong Kim
- Department of Radiation Oncology, School of Medicine, Virginia Commonwealth University, Richmond, VA, 23284, USA.
| | - Tae Suk Suh
- Department of Biomedical Engineering and Research Institute of Biomedical Engineering, College of Medicine, The Catholic University of Korea, 222 Banpo-daero, Seocho-gu, Seoul, 06591, Republic of Korea.
| |
Collapse
|