51. Zhou K, Shum HPH, Li FWB, Liang X. Multi-Task Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising. IEEE Transactions on Visualization and Computer Graphics 2024; 30:6754-6769. [PMID: 38032781 DOI: 10.1109/tvcg.2023.3337868]
Abstract
In many human-computer interaction applications, fast and accurate hand tracking is necessary for an immersive experience. However, raw hand motion data can be flawed due to issues such as joint occlusions and high-frequency noise, hindering the interaction. Using only current motion for interaction can lead to lag, so predicting future movement is crucial for a faster response. Our solution is the Multi-task Spatial-Temporal Graph Auto-Encoder (Multi-STGAE), a model that accurately denoises and predicts hand motion by exploiting the inter-dependency of both tasks. The model ensures a stable and accurate prediction through denoising while maintaining motion dynamics to avoid over-smoothed motion and alleviate time delays through prediction. A gate mechanism is integrated to prevent negative transfer between tasks and further boost multi-task performance. Multi-STGAE also includes a spatial-temporal graph autoencoder block, which models hand structures and motion coherence through graph convolutional networks, reducing noise while preserving hand physiology. Additionally, we design a novel hand partition strategy and hand bone loss to improve natural hand motion generation. We validate the effectiveness of our proposed method by contributing two large-scale datasets with a data corruption algorithm based on two benchmark datasets. To evaluate the natural characteristics of the denoised and predicted hand motion, we propose two structural metrics. Experimental results show that our method outperforms the state-of-the-art, showcasing how the multi-task framework enables mutual benefits between denoising and prediction.
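The gate mechanism described above can be sketched in a few lines of NumPy: a sigmoid gate, computed from the concatenated features, blends shared and task-specific features so that each task head can throttle how much of the (potentially negatively transferring) shared signal it consumes. The shapes and the `gated_fusion` helper are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def gated_fusion(shared, specific, w_g, b_g):
    """Blend shared and task-specific features with a learned sigmoid gate.

    A toy sketch of the gating idea only: the gate g in (0, 1) is computed
    from the concatenated features, and the task head receives
    g * shared + (1 - g) * specific, letting each task suppress shared
    features that would cause negative transfer.
    """
    z = np.concatenate([shared, specific], axis=-1)
    g = 1.0 / (1.0 + np.exp(-(z @ w_g + b_g)))   # sigmoid gate, same shape as `shared`
    return g * shared + (1.0 - g) * specific

rng = np.random.default_rng(0)
shared = rng.normal(size=(4, 8))      # features shared by denoising and prediction
specific = rng.normal(size=(4, 8))    # features private to one task
w_g = rng.normal(scale=0.1, size=(16, 8))
b_g = np.zeros(8)
fused = gated_fusion(shared, specific, w_g, b_g)
```

Because the gate is elementwise in (0, 1), the fused features are always a convex combination of the shared and task-specific inputs.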
52. Bousse A, Kandarpa VSS, Rit S, Perelli A, Li M, Wang G, Zhou J, Wang G. Systematic Review on Learning-based Spectral CT. arXiv 2024; arXiv:2304.07588v9. [PMID: 37461421 PMCID: PMC10350100]
Abstract
Spectral computed tomography (CT) has recently emerged as an advanced version of medical CT and significantly improves conventional (single-energy) CT. Spectral CT has two main forms: dual-energy computed tomography (DECT) and photon-counting computed tomography (PCCT), which offer image improvement, material decomposition, and feature quantification relative to conventional CT. However, the inherent challenges of spectral CT, evidenced by data and image artifacts, remain a bottleneck for clinical applications. To address these problems, machine learning techniques have been widely applied to spectral CT. In this review, we present the state-of-the-art data-driven techniques for spectral CT.
Affiliation(s)
- Simon Rit
- Univ. Lyon, INSA-Lyon, Université Claude Bernard Lyon 1, UJM-Saint Étienne, CNRS, Inserm, CREATIS UMR 5220, U1294, F-69373, Lyon, France
- Alessandro Perelli
- School of Science and Engineering, University of Dundee, DD1 4HN Dundee, U.K.
- Mengzhou Li
- Biomedical Imaging Center, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
- Guobao Wang
- Department of Radiology, University of California Davis Health, Sacramento, CA 95817, USA
- Jian Zhou
- CTIQ, Canon Medical Research USA, Inc., Vernon Hills, IL 60061, USA
- Ge Wang
- Biomedical Imaging Center, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
53. Hein D, Holmin S, Szczykutowicz T, Maltz JS, Danielsson M, Wang G, Persson M. Noise suppression in photon-counting computed tomography using unsupervised Poisson flow generative models. Vis Comput Ind Biomed Art 2024; 7:24. [PMID: 39311990 PMCID: PMC11420411 DOI: 10.1186/s42492-024-00175-6]
Abstract
Deep learning (DL) has proven to be important for computed tomography (CT) image denoising. However, such models are usually trained under supervision, requiring paired data that may be difficult to obtain in practice. Diffusion models offer unsupervised means of solving a wide range of inverse problems via posterior sampling. In particular, using the estimated unconditional score function of the prior distribution, obtained via unsupervised learning, one can sample from the desired posterior via hijacking and regularization. However, due to the iterative solvers used, the number of function evaluations (NFE) required may be orders of magnitude larger than for single-step samplers. In this paper, we present a novel image denoising technique for photon-counting CT by extending the unsupervised approach to inverse problem solving to the case of Poisson flow generative models (PFGM++). By hijacking and regularizing the sampling process, we obtain a single-step sampler (NFE = 1). Our proposed method incorporates posterior sampling using diffusion models as a special case. We demonstrate that the added robustness afforded by the PFGM++ framework yields significant performance gains. Our results indicate performance competitive with popular supervised techniques, including state-of-the-art diffusion-style models with NFE = 1 (consistency models), as well as unsupervised and non-DL-based image denoising techniques, on clinical low-dose CT data and clinical images from a prototype photon-counting CT system developed by GE HealthCare.
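To see why hijacking the sampling process can collapse posterior sampling to NFE = 1, consider a toy setting where the posterior is available in closed form: a Gaussian prior under additive Gaussian noise. The single "denoising" evaluation below is the exact posterior mean for that toy model; it illustrates the one-step idea only and is not the PFGM++ sampler itself.

```python
import numpy as np

def single_step_posterior_mean(y, mu, tau2, sigma2):
    """One-step 'denoise' of a measurement y = x + n, n ~ N(0, sigma2),
    under a toy Gaussian prior x ~ N(mu, tau2).

    Illustrates why a sampler hijacked to start at the noisy image with a
    matched noise scale can finish in a single function evaluation (NFE = 1):
    for this toy prior the optimal step is the closed-form posterior mean.
    """
    w = tau2 / (tau2 + sigma2)          # shrinkage toward the prior mean
    return w * y + (1.0 - w) * mu

y = np.array([2.0, -1.0, 0.5])
x_hat = single_step_posterior_mean(y, mu=0.0, tau2=1.0, sigma2=1.0)
```

With `tau2 == sigma2` and a zero prior mean, the estimate is simply `y / 2`: the single step shrinks the measurement halfway toward the prior.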
Affiliation(s)
- Dennis Hein
- Department of Physics, KTH Royal Institute of Technology, Stockholm, 1142, Sweden
- MedTechLabs, Karolinska University Hospital, Stockholm, 17164, Sweden
- Staffan Holmin
- Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, 17164, Sweden
- Department of Neuroradiology, Karolinska University Hospital, Stockholm, 17164, Sweden
- Timothy Szczykutowicz
- Department of Radiology, School of Medicine and Public Health, University of Wisconsin, Madison, WI, 53705, United States
- Mats Danielsson
- Department of Physics, KTH Royal Institute of Technology, Stockholm, 1142, Sweden
- MedTechLabs, Karolinska University Hospital, Stockholm, 17164, Sweden
- Ge Wang
- Department of Biomedical Engineering, School of Engineering, Biomedical Imaging Center, Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, NY, 12180, United States
- Mats Persson
- Department of Physics, KTH Royal Institute of Technology, Stockholm, 1142, Sweden
- MedTechLabs, Karolinska University Hospital, Stockholm, 17164, Sweden
54. Son Y, Jeong S, Hong Y, Lee J, Jeon B, Choi H, Kim J, Shim H. Improvement in Image Quality of Low-Dose CT of Canines with Generative Adversarial Network of Anti-Aliasing Generator and Multi-Scale Discriminator. Bioengineering (Basel) 2024; 11:944. [PMID: 39329686 PMCID: PMC11428420 DOI: 10.3390/bioengineering11090944]
Abstract
Computed tomography (CT) imaging is vital for diagnosing and monitoring diseases in both humans and animals, yet radiation exposure remains a significant concern, especially in animal imaging. Low-dose CT (LDCT) minimizes radiation exposure but often compromises image quality due to a reduced signal-to-noise ratio (SNR). Recent advancements in deep learning, particularly with CycleGAN, offer promising solutions for denoising LDCT images, though challenges in preserving anatomical detail and image sharpness persist. This study introduces a novel framework tailored for animal LDCT imaging, integrating deep learning techniques within the CycleGAN architecture. Key components include BlurPool for mitigating high-resolution image distortion, PixelShuffle for enhancing expressiveness, hierarchical feature synthesis (HFS) networks for feature retention, and spatial channel squeeze excitation (scSE) blocks for contrast reproduction. Additionally, a multi-scale discriminator enhances detail assessment, supporting effective adversarial learning. Rigorous experimentation on veterinary CT images demonstrates our framework's superiority over traditional denoising methods, achieving significant improvements in noise reduction, contrast enhancement, and anatomical structure preservation. Extensive evaluations show that our method achieves a precision of 0.93 and a recall of 0.94, validating our approach's efficacy and highlighting its potential to enhance diagnostic accuracy in veterinary imaging. We confirm the critical role of the scSE blocks in optimizing performance, and the framework's robustness to input variations underscores its practical utility.
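The anti-aliasing idea behind BlurPool can be shown with a minimal single-channel NumPy sketch: a fixed separable binomial low-pass filter is applied before stride-2 subsampling, removing the frequencies that naive subsampling would alias. This is an illustration of the blur-then-subsample principle under stated assumptions, not the paper's layer.

```python
import numpy as np

def blur_pool(img):
    """Blur-then-subsample (stride 2), the anti-aliasing idea behind BlurPool.

    A fixed separable binomial kernel [1, 2, 1] / 4 low-passes the image
    before stride-2 subsampling, so high frequencies cannot alias into the
    downsampled output. Toy single-channel version with edge padding.
    """
    k = np.array([1.0, 2.0, 1.0]) / 4.0
    pad = np.pad(img, 1, mode="edge")
    # separable blur: filter along columns, then along rows
    rows = sum(k[i] * pad[:, i:i + img.shape[1]] for i in range(3))
    blurred = sum(k[i] * rows[i:i + img.shape[0], :] for i in range(3))
    return blurred[::2, ::2]

flat = np.full((8, 8), 3.0)
pooled = blur_pool(flat)   # a constant image stays constant after blur-pooling
```

Because the kernel sums to one, smooth regions pass through unchanged, while a 1-pixel-period stripe pattern, which naive `img[::2, ::2]` subsampling would alias into a solid image, is averaged out instead.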
Affiliation(s)
- Yuseong Son
- Department of Computer Engineering, Hankuk University of Foreign Studies, Seoul 02450, Republic of Korea
- Sihyeon Jeong
- Brain Korea 21 Project, Graduate School of Medical Science, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
- CONNECT-AI Research Center, Yonsei University College of Medicine, Seoul 03760, Republic of Korea
- Youngtaek Hong
- CONNECT-AI Research Center, Yonsei University College of Medicine, Seoul 03760, Republic of Korea
- Jina Lee
- Brain Korea 21 Project, Graduate School of Medical Science, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
- CONNECT-AI Research Center, Yonsei University College of Medicine, Seoul 03760, Republic of Korea
- Byunghwan Jeon
- Department of Computer Engineering, Hankuk University of Foreign Studies, Seoul 02450, Republic of Korea
- Hyunji Choi
- Department of Veterinary Radiology, College of Veterinary Medicine, Konkuk University, Seoul 05029, Republic of Korea
- Jaehwan Kim
- Department of Veterinary Radiology, College of Veterinary Medicine, Konkuk University, Seoul 05029, Republic of Korea
- Hackjoon Shim
- CONNECT-AI Research Center, Yonsei University College of Medicine, Seoul 03760, Republic of Korea
- Canon Medical Systems Korea Medical Imaging AI Research Center, Seoul 06173, Republic of Korea
55. Chakravarty A, Emre T, Leingang O, Riedl S, Mai J, Scholl HP, Sivaprasad S, Rueckert D, Lotery A, Schmidt-Erfurth U, Bogunović H. Morph-SSL: Self-Supervision With Longitudinal Morphing for Forecasting AMD Progression From OCT Volumes. IEEE Transactions on Medical Imaging 2024; 43:3224-3239. [PMID: 38635383 PMCID: PMC7616690 DOI: 10.1109/tmi.2024.3390940]
Abstract
The lack of reliable biomarkers makes predicting the conversion from intermediate to neovascular age-related macular degeneration (iAMD, nAMD) a challenging task. We develop a Deep Learning (DL) model to predict the future risk of conversion of an eye from iAMD to nAMD from its current OCT scan. Although eye clinics generate vast amounts of longitudinal OCT scans to monitor AMD progression, only a small subset can be manually labeled for supervised DL. To address this issue, we propose Morph-SSL, a novel Self-supervised Learning (SSL) method for longitudinal data. It uses pairs of unlabelled OCT scans from different visits and involves morphing the scan from the previous visit to the next. The Decoder predicts the transformation for morphing and ensures a smooth feature manifold that can generate intermediate scans between visits through linear interpolation. Next, the Morph-SSL trained features are input to a Classifier which is trained in a supervised manner to model the cumulative probability distribution of the time to conversion with a sigmoidal function. Morph-SSL was trained on unlabelled scans of 399 eyes (3570 visits). The Classifier was evaluated with a five-fold cross-validation on 2418 scans from 343 eyes with clinical labels of the conversion date. The Morph-SSL features achieved an AUC of 0.779 in predicting the conversion to nAMD within the next 6 months, outperforming the same network when trained end-to-end from scratch or pre-trained with popular SSL methods. Automated prediction of the future risk of nAMD onset can enable timely treatment and individualized AMD management.
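The Classifier's sigmoidal model of the cumulative time-to-conversion probability can be sketched in a few lines. The parameters `t50` (predicted time at 50% risk) and `slope` are hypothetical per-eye values introduced for illustration, not quantities from the paper.

```python
import math

def conversion_risk(t_months, t50, slope):
    """Cumulative probability that an iAMD eye converts to nAMD by time t,
    modeled with a sigmoid in time, mirroring the sigmoidal CDF used by the
    Classifier stage. `t50` and `slope` are illustrative per-eye parameters.
    """
    return 1.0 / (1.0 + math.exp(-slope * (t_months - t50)))

# risk of conversion within the next 6 months for a hypothetical eye
risk_6m = conversion_risk(6.0, t50=6.0, slope=0.8)
```

At `t = t50` the cumulative risk is exactly 0.5, and the risk grows monotonically with the horizon, as a CDF must.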
Affiliation(s)
- Arunava Chakravarty
- Department of Ophthalmology and Optometry, Medical University of Vienna, 1090 Vienna, Austria
- Taha Emre
- Department of Ophthalmology and Optometry, Medical University of Vienna, 1090 Vienna, Austria
- Oliver Leingang
- Department of Ophthalmology and Optometry, Medical University of Vienna, 1090 Vienna, Austria
- Sophie Riedl
- Department of Ophthalmology and Optometry, Medical University of Vienna, 1090 Vienna, Austria
- Julia Mai
- Department of Ophthalmology and Optometry, Medical University of Vienna, 1090 Vienna, Austria
- Hendrik P.N. Scholl
- Institute of Molecular and Clinical Ophthalmology Basel, 4031 Basel, Switzerland, and also with the Department of Ophthalmology, University of Basel, 4001 Basel, Switzerland
- Sobha Sivaprasad
- NIHR Moorfields Biomedical Research Centre, Moorfields Eye Hospital NHS Foundation Trust, EC1V 2PD London, U.K.
- Daniel Rueckert
- BioMedIA, Imperial College London, SW7 2AZ London, U.K.; Institute for AI and Informatics in Medicine, Klinikum rechts der Isar, Technical University of Munich, 80333 Munich, Germany
- Andrew Lotery
- Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, SO17 1BJ Southampton, U.K.
- Ursula Schmidt-Erfurth
- Department of Ophthalmology and Optometry, Medical University of Vienna, 1090 Vienna, Austria
- Hrvoje Bogunović
- Department of Ophthalmology and Optometry and the Christian Doppler Laboratory for Artificial Intelligence in Retina, Medical University of Vienna, 1090 Vienna, Austria
56. Li K, Li H, Anastasio MA. Investigating the use of signal detection information in supervised learning-based image denoising with consideration of task-shift. J Med Imaging (Bellingham) 2024; 11:055501. [PMID: 39247217 PMCID: PMC11376226 DOI: 10.1117/1.jmi.11.5.055501]
Abstract
Purpose Recently, learning-based denoising methods that incorporate task-relevant information into the training procedure have been developed to enhance the utility of the denoised images. However, this line of research is relatively new and underdeveloped, and some fundamental issues remain unexplored. Our purpose is to yield insights into general issues related to these task-informed methods. This includes understanding the impact of denoising on objective measures of image quality (IQ) when the specified task at inference time is different from that employed for model training, a phenomenon we refer to as "task-shift." Approach A virtual imaging test bed comprising a stylized computational model of a chest X-ray computed tomography imaging system was employed to enable a controlled and tractable study design. A canonical, fully supervised, convolutional neural network-based denoising method was purposely adopted to understand the underlying issues that may be relevant to a variety of applications and more advanced denoising or image reconstruction methods. Signal detection and signal detection-localization tasks under signal-known-statistically with background-known-statistically conditions were considered, and several distinct types of numerical observers were employed to compute estimates of the task performance. Studies were designed to reveal how a task-informed transfer-learning approach can influence the tradeoff between conventional and task-based measures of image quality within the context of the considered tasks. In addition, the impact of task-shift on these image quality measures was assessed. Results The results indicated that certain tradeoffs can be achieved such that the resulting AUC value was significantly improved and the degradation of physical IQ measures was statistically insignificant. It was also observed that introducing task-shift degrades the task performance as expected. The degradation was significant when a relatively simple task was considered for network training and observer performance on a more complex one was assessed at inference time. Conclusions The presented results indicate that the task-informed training method can improve the observer performance while providing control over the tradeoff between traditional and task-based measures of image quality. The behavior of a task-informed model fine-tuning procedure was demonstrated, and the impact of task-shift on task-based image quality measures was investigated.
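Task-based measures like the AUC values discussed above are typically estimated nonparametrically from the observer's test statistics on signal-present and signal-absent images. A minimal sketch of that standard Mann-Whitney estimator, with made-up scores for illustration:

```python
def empirical_auc(signal_scores, noise_scores):
    """Nonparametric AUC (Mann-Whitney U / (n1 * n2)): the probability that
    an observer's test statistic ranks a signal-present image above a
    signal-absent one, with ties counted as 1/2. This is the standard way a
    task-based AUC is estimated from a finite set of observer scores.
    """
    n_pairs = len(signal_scores) * len(noise_scores)
    wins = sum((s > n) + 0.5 * (s == n)
               for s in signal_scores for n in noise_scores)
    return wins / n_pairs

auc = empirical_auc([2.0, 1.5, 0.9], [0.1, 0.8, 1.2])
```

With these toy scores, 8 of the 9 signal/noise pairs are ranked correctly, giving an AUC of 8/9; perfectly separated score distributions give 1.0.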
Affiliation(s)
- Kaiyan Li
- University of Illinois Urbana-Champaign, Department of Bioengineering, Urbana, Illinois, United States
- Hua Li
- University of Illinois Urbana-Champaign, Department of Bioengineering, Urbana, Illinois, United States
- Washington University School of Medicine in St. Louis, Department of Radiation Oncology, Saint Louis, Missouri, United States
- Mark A. Anastasio
- University of Illinois Urbana-Champaign, Department of Bioengineering, Urbana, Illinois, United States
57. Tunissen SAM, Moriakov N, Mikerov M, Smit EJ, Sechopoulos I, Teuwen J. Deep learning-based low-dose CT simulator for non-linear reconstruction methods. Med Phys 2024; 51:6046-6060. [PMID: 38843540 DOI: 10.1002/mp.17232]
Abstract
BACKGROUND Computer algorithms that simulate lower-dose computed tomography (CT) images from clinical-dose images are widely available. However, most operate in the projection domain and assume access to the reconstruction method. Access to commercial reconstruction methods may often not be available in medical research, making image-domain noise simulation methods useful. However, the introduction of non-linear reconstruction methods, such as iterative and deep learning-based reconstruction, makes noise insertion in the image domain intractable, as it is not possible to determine the noise textures analytically. PURPOSE To develop a deep learning-based image-domain method to generate low-dose CT images from clinical-dose CT (CDCT) images for non-linear reconstruction methods. METHODS We propose a fully image domain-based method, utilizing a series of three convolutional neural networks (CNNs), which, respectively, denoise CDCT images, predict the standard deviation map of the low-dose image, and generate the noise power spectra (NPS) of local patches throughout the low-dose image. All three models have U-net-based architectures and are partly or fully three-dimensional. As a use case for this study, and with no loss of generality, we use paired low-dose and clinical-dose brain CT scans. A dataset of 326 paired scans was retrospectively obtained. All images were acquired with a wide-area detector clinical system and reconstructed using its standard clinical iterative algorithm. Each pair was registered using rigid registration to correct for motion between acquisitions. The data was randomly partitioned into training (251 samples), validation (25 samples), and test (50 samples) sets. The performance of each of these three CNNs was validated separately. For the denoising CNN, the local decrease in standard deviation and the bias were determined. For the standard deviation map CNN, the real and estimated standard deviations were compared locally. Finally, for the NPS CNN, the NPS of the synthetic and real low-dose noise were compared inside and outside the skull. Two proof-of-concept denoising studies were performed to determine whether the performance of a CNN- or a gradient-based denoising filter differed between the synthetic and real low-dose data. RESULTS The denoising network decreased the median noise in the cerebrospinal fluid by a factor of 1.71 and introduced a median bias of +0.7 HU. The network for standard deviation map estimation had a median error of +0.1 HU. The noise power spectrum estimation network was able to capture the anisotropic and shift-variant nature of the noise structure, showing good agreement between the synthetic and real low-dose noise and their corresponding power spectra. The two proof-of-concept denoising studies showed only minimal differences in the standard deviation improvement ratio between the synthetic and real low-dose CT images, with the median difference being 0.0 and +0.05 for the CNN- and gradient-based filter, respectively. CONCLUSION The proposed method demonstrated good performance in generating synthetic low-dose brain CT scans without access to the projection data or to the reconstruction method. This method can generate multiple low-dose image realizations from one clinical-dose image, so it is useful for validation, optimization, and repeatability studies of image-processing algorithms.
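The final assembly step implied by the three-network design (a denoised image plus a predicted standard deviation map plus predicted local NPS) can be sketched with the classic FFT colored-noise method: white noise is shaped in the frequency domain by the square root of the NPS, then scaled by the local noise level. Here a flat NPS and a constant standard deviation map stand in for the CNN outputs; this is a generic sketch of the idea, not the paper's shift-variant patch-based implementation.

```python
import numpy as np

def synthesize_noise(nps, std_map, rng):
    """Generate one noise realization whose power spectrum follows `nps`
    (a non-negative 2-D array on the FFT grid) and whose local magnitude
    follows `std_map`. A generic FFT colored-noise sketch of the assembly
    step; in the paper, CNNs would supply `nps` and `std_map` locally.
    """
    white = rng.normal(size=nps.shape)
    colored = np.fft.ifft2(np.fft.fft2(white) * np.sqrt(nps)).real
    colored /= colored.std() + 1e-12          # normalize to unit variance
    return colored * std_map                  # impose the local noise level

rng = np.random.default_rng(0)
nps = np.ones((64, 64))                       # flat NPS -> white noise
std_map = np.full((64, 64), 5.0)              # e.g., 5 HU everywhere
noise = synthesize_noise(nps, std_map, rng)
```

Because the realization is drawn from fresh white noise each call, the same clinical-dose image can yield arbitrarily many independent low-dose realizations, which is exactly what makes the approach useful for repeatability studies.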
Affiliation(s)
- Nikita Moriakov
- Department of Medical Imaging, Radboudumc, Nijmegen, The Netherlands
- AI for Oncology Lab, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Mikhail Mikerov
- Department of Medical Imaging, Radboudumc, Nijmegen, The Netherlands
- Ewoud J Smit
- Department of Medical Imaging, Radboudumc, Nijmegen, The Netherlands
- Ioannis Sechopoulos
- Department of Medical Imaging, Radboudumc, Nijmegen, The Netherlands
- Dutch Expert Centre for Screening (LRCB), Nijmegen, The Netherlands
- Technical Medicine Centre, University of Twente, Enschede, The Netherlands
- Jonas Teuwen
- Department of Medical Imaging, Radboudumc, Nijmegen, The Netherlands
- AI for Oncology Lab, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, USA
58. Winfree T, McCollough C, Yu L. Development and validation of a noise insertion algorithm for photon-counting-detector CT. Med Phys 2024; 51:5943-5953. [PMID: 38923526 PMCID: PMC11489017 DOI: 10.1002/mp.17263]
Abstract
BACKGROUND Inserting noise into existing patient projection data to simulate lower-radiation-dose exams has frequently been used in traditional energy-integrating-detector (EID) CT to optimize radiation dose in clinical protocols and to generate paired images for training deep-learning-based reconstruction and noise reduction methods. The recent introduction of photon-counting-detector CT (PCD-CT) also requires such a method to accomplish these tasks. However, clinical PCD-CT scanners often restrict the user's access to the raw count data, exporting only the preprocessed, log-normalized sinogram. Therefore, it remains a challenge to employ projection-domain noise insertion algorithms on PCD-CT. PURPOSE To develop and validate a projection-domain noise insertion algorithm for PCD-CT that does not require access to the raw count data. MATERIALS AND METHODS A projection-domain noise model developed originally for EID-CT was adapted for PCD-CT. This model requires, as input, a map of the incident number of photons at each detector pixel when no object is in the beam. To obtain this map, air scans were acquired on a PCD-CT scanner, and the noise-equivalent photon number (NEPN) was calculated from the variance in the log-normalized projection data of each scan. Additional air scans were acquired at various mA settings to investigate the impact of pulse pileup on the linearity of the NEPN measurement. To validate the noise insertion algorithm, noise power spectra (NPS) were generated from a 30 cm water tank scan and used to compare the noise texture and noise level of measured and simulated half-dose and quarter-dose images. An anthropomorphic thorax phantom was scanned with automatic exposure control, and noise levels at different slice locations were compared between simulated and measured half-dose and quarter-dose images. Spectral correlation between energy thresholds (T1, T2) and energy bins (B1, B2) was compared between simulated and measured data across a wide range of tube currents. Additionally, noise insertion was performed on a clinical patient case for qualitative assessment. RESULTS The NPS generated from simulated low-dose water tank images showed similar shape and amplitude to those generated from the measured low-dose images, differing by a maximum of 5.0% for half-dose (HD) T1 images, 6.3% for HD T2 images, 4.1% for quarter-dose (QD) T1 images, and 6.1% for QD T2 images. Noise-versus-slice measurements of the lung phantom showed comparable results between measured and simulated low-dose images, with root-mean-square percent errors of 5.9%, 5.4%, 5.0%, and 4.6% for QD T1, HD T1, QD T2, and HD T2, respectively. NEPN measurements in air were linear up to 112 mA, after which pulse pileup effects significantly distorted the air-scan NEPN profile. Spectral correlation between T1 and T2 in simulation agreed well with that in the measured data over typical dose ranges. CONCLUSIONS A projection-domain noise insertion algorithm was developed and validated for PCD-CT to synthesize low-dose images from existing scans. It can be used for optimizing scanning protocols and generating paired images for training deep-learning-based methods.
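The core of this family of algorithms can be sketched from the standard log-domain noise model: for a log-normalized sinogram p, the noise variance is approximately exp(p)/N0, where N0 is the air-scan NEPN per detector pixel, so simulating a dose fraction f requires adding noise with variance (1/f − 1)·exp(p)/N0. The sketch below uses uncorrelated Gaussian noise for clarity; it is a generic illustration, not the validated, correlation-aware PCD-CT model of the paper.

```python
import numpy as np

def insert_noise(log_sino, nepn_air, dose_fraction, rng):
    """Simulate a reduced-dose scan from a log-normalized sinogram.

    Uses the standard approximation var(p) ~ exp(p) / N0, where N0 is the
    air-scan noise-equivalent photon number per detector pixel, so the extra
    noise needed for a dose fraction f has variance (1/f - 1) * exp(p) / N0.
    Generic sketch only: no detector cross-talk or spectral correlation.
    """
    extra_var = (1.0 / dose_fraction - 1.0) * np.exp(log_sino) / nepn_air
    return log_sino + rng.normal(size=log_sino.shape) * np.sqrt(extra_var)

rng = np.random.default_rng(0)
p = np.full((256, 256), 3.0)                  # uniform log attenuation
n0 = np.full((256, 256), 1.0e5)               # air-scan NEPN map
half = insert_noise(p, n0, 0.5, rng)          # simulated half dose
quarter = insert_noise(p, n0, 0.25, rng)      # simulated quarter dose
```

Note that only the log-normalized data and the air-scan NEPN map are needed, mirroring the paper's constraint that raw count data are inaccessible on clinical PCD-CT scanners.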
Affiliation(s)
- Lifeng Yu
- Department of Radiology, Mayo Clinic, Rochester, MN, USA
59. Liu X, Xiang C, Lan L, Li C, Xiao H, Liu Z. Lesion region inpainting: an approach for pseudo-healthy image synthesis in intracranial infection imaging. Front Microbiol 2024; 15:1453870. [PMID: 39224212 PMCID: PMC11368058 DOI: 10.3389/fmicb.2024.1453870]
Abstract
The synthesis of pseudo-healthy images, involving the generation of healthy counterparts for pathological images, is crucial for data augmentation, clinical disease diagnosis, and understanding pathology-induced changes. Recently, Generative Adversarial Networks (GANs) have shown substantial promise in this domain. However, the heterogeneity of intracranial infection symptoms caused by various infections complicates the model's ability to accurately differentiate between pathological and healthy regions, leading to the loss of critical information in healthy areas and impairing the precise preservation of the subject's identity. Moreover, for images with extensive lesion areas, the pseudo-healthy images generated by these methods often lack distinct organ and tissue structures. To address these challenges, we propose a three-stage method (localization, inpainting, synthesis) that achieves nearly perfect preservation of the subject's identity through precise pseudo-healthy synthesis of the lesion region and its surroundings. The process begins with a Segmentor, which identifies the lesion areas and differentiates them from healthy regions. Subsequently, a Vague-Filler fills the lesion areas to construct a healthy outline, thereby preventing structural loss in cases of extensive lesions. Finally, leveraging this healthy outline, a Generative Adversarial Network integrated with a contextual residual attention module generates a more realistic and clearer image. Our method was validated through extensive experiments across different modalities within the BraTS2021 dataset, achieving a healthiness score of 0.957. The visual quality of the generated images markedly exceeded those produced by competing methods, with enhanced capabilities in repairing large lesion areas. Further testing on the COVID-19-20 dataset showed that our model could effectively partially reconstruct images of other organs.
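A classical stand-in for the Vague-Filler stage makes the three-stage idea concrete: given a lesion mask from the Segmentor, iterated neighborhood averaging fills the masked region with a smooth "healthy outline" consistent with the surrounding tissue, which a generative stage could then refine. The diffusion fill below is a textbook technique used for illustration, not the paper's learned module.

```python
import numpy as np

def diffuse_fill(img, mask, n_iter=200):
    """Fill masked (lesion) pixels by iterated 4-neighbour averaging.

    A classical stand-in for a learned filler: the masked region converges
    to the discrete harmonic extension of its surroundings, yielding a
    smooth outline that respects the boundary tissue values.
    """
    out = img.copy()
    out[mask] = img[~mask].mean()            # crude initialization
    for _ in range(n_iter):
        avg = (np.roll(out, 1, 0) + np.roll(out, -1, 0) +
               np.roll(out, 1, 1) + np.roll(out, -1, 1)) / 4.0
        out[mask] = avg[mask]                # update only the lesion region
    return out

base = np.tile(np.linspace(0.0, 1.0, 16), (16, 1))   # smooth 'healthy' tissue
img = base.copy()
img[4:8, 4:8] = 9.0                                  # synthetic 'lesion'
mask = img == 9.0
filled = diffuse_fill(img, mask)
```

Because a linear intensity ramp is harmonic, the fill recovers the underlying gradient inside the lesion almost exactly while leaving every unmasked pixel untouched.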
Affiliation(s)
- Xiaojuan Liu
- College of Artificial Intelligence, Chongqing University of Technology, Chongqing, China
- College of Big Data and Intelligent Engineering, Chongqing College of International Business and Economics, Chongqing, China
- Cong Xiang
- College of Artificial Intelligence, Chongqing University of Technology, Chongqing, China
- Libin Lan
- College of Computer Science and Engineering, Chongqing University of Technology, Chongqing, China
- Chuan Li
- College of Big Data and Intelligent Engineering, Chongqing College of International Business and Economics, Chongqing, China
- Hanguang Xiao
- College of Artificial Intelligence, Chongqing University of Technology, Chongqing, China
- Zhi Liu
- College of Artificial Intelligence, Chongqing University of Technology, Chongqing, China
60. Wu J, Jiang X, Zhong L, Zheng W, Li X, Lin J, Li Z. Linear diffusion noise boosted deep image prior for unsupervised sparse-view CT reconstruction. Phys Med Biol 2024; 69:165029. [PMID: 39119998 DOI: 10.1088/1361-6560/ad69f7]
Abstract
Objective. Deep learning has markedly enhanced the performance of sparse-view computed tomography reconstruction. However, the dependence of these methods on supervised training using high-quality paired datasets, and the necessity for retraining under varied physical acquisition conditions, constrain their generalizability across new imaging contexts and settings. Approach. To overcome these limitations, we propose an unsupervised approach grounded in the deep image prior framework. Our approach advances beyond the conventional single noise level input by incorporating multi-level linear diffusion noise, significantly mitigating the risk of overfitting. Furthermore, we embed non-local self-similarity as a deep implicit prior within a self-attention network structure, improving the model's capability to identify and utilize repetitive patterns throughout the image. Additionally, leveraging imaging physics, gradient backpropagation is performed between the image domain and projection data space to optimize network weights. Main Results. Evaluations with both simulated and clinical cases demonstrate our method's effective zero-shot adaptability across various projection views, highlighting its robustness and flexibility. Additionally, our approach effectively eliminates noise and streak artifacts while significantly restoring intricate image details. Significance. Our method aims to overcome the limitations in current supervised deep learning-based sparse-view CT reconstruction, offering improved generalizability and adaptability without the need for extensive paired training data.
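The image-domain/projection-domain gradient coupling can be made concrete with a toy linear forward projector: the data-fidelity gradient A^T(Ax − y) flows from the projection space back into the image estimate. In the paper this gradient would be backpropagated further into the network weights; the hypothetical dense matrix `A` below merely stands in for the imaging physics.

```python
import numpy as np

def data_fidelity_step(x, A, y, lr):
    """One gradient step on the projection-consistency term 0.5 * ||Ax - y||^2.

    The gradient A.T @ (A @ x - y) carries the mismatch measured in the
    projection domain back into the image domain, which is the coupling the
    deep-image-prior optimization relies on. `A` is a toy linear projector.
    """
    residual = A @ x - y
    return x - lr * (A.T @ residual), 0.5 * float(residual @ residual)

rng = np.random.default_rng(0)
A = rng.normal(size=(30, 60))          # sparse-view: fewer rays than pixels
x_true = rng.normal(size=60)
y = A @ x_true                         # noiseless sparse-view projections
x = np.zeros(60)
losses = []
for _ in range(50):
    x, loss = data_fidelity_step(x, A, y, lr=1e-3)
    losses.append(loss)
```

With an underdetermined system (30 measurements, 60 unknowns) the data term alone cannot pin down `x_true`, which is exactly why the method pairs this gradient with image-domain priors; the loss nevertheless decreases monotonically for a sufficiently small step size.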
Affiliation(s)
- Jia Wu
- School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, People's Republic of China
- School of Medical Information and Engineering, Southwest Medical University, Luzhou 646000, People's Republic of China
- Xiaoming Jiang
- Chongqing Engineering Research Center of Medical Electronics and Information Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, People's Republic of China
- Lisha Zhong
- School of Medical Information and Engineering, Southwest Medical University, Luzhou 646000, People's Republic of China
- Wei Zheng
- Key Laboratory of Big Data Intelligent Computing, Chongqing University of Posts and Telecommunications, Chongqing 400065, People's Republic of China
- Xinwei Li
- Chongqing Engineering Research Center of Medical Electronics and Information Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, People's Republic of China
- Jinzhao Lin
- School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, People's Republic of China
- Zhangyong Li
- Chongqing Engineering Research Center of Medical Electronics and Information Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, People's Republic of China
|
61
|
Chen Z, Hu B, Niu C, Chen T, Li Y, Shan H, Wang G. IQAGPT: computed tomography image quality assessment with vision-language and ChatGPT models. Vis Comput Ind Biomed Art 2024; 7:20. [PMID: 39101954 DOI: 10.1186/s42492-024-00171-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2024] [Accepted: 07/24/2024] [Indexed: 08/06/2024] Open
Abstract
Large language models (LLMs), such as ChatGPT, have demonstrated impressive capabilities in various tasks and attracted increasing interest as a natural language interface across many domains. Recently, large vision-language models (VLMs) that learn rich vision-language correlation from image-text pairs, like BLIP-2 and GPT-4, have been intensively investigated. Despite these developments, however, the application of LLMs and VLMs to image quality assessment (IQA), particularly in medical imaging, remains unexplored, even though it would be valuable for objective performance evaluation and could supplement or even substitute for radiologists' opinions. To this end, this study introduces IQAGPT, an innovative computed tomography (CT) IQA system that integrates an image-quality captioning VLM with ChatGPT to generate quality scores and textual reports. First, a CT-IQA dataset comprising 1,000 CT slices with diverse quality levels is professionally annotated and compiled for training and evaluation. To better leverage the capabilities of LLMs, the annotated quality scores are converted into semantically rich text descriptions using a prompt template. Second, the image-quality captioning VLM is fine-tuned on the CT-IQA dataset to generate quality descriptions; the captioning model fuses image and text features through cross-modal attention. Third, based on the quality descriptions, users verbally request ChatGPT to rate image-quality scores or produce radiological quality reports. The results demonstrate the feasibility of assessing image quality using LLMs: the proposed IQAGPT outperformed GPT-4 and CLIP-IQA, as well as multitask classification and regression models that rely solely on images.
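The score-to-text conversion step described in this abstract can be as simple as a prompt template. The bins and wording below are purely illustrative, not those used by IQAGPT:

```python
def quality_prompt(score: float, modality: str = "CT") -> str:
    """Convert a numeric image-quality score (1-5) into a text description
    suitable as a captioning target or LLM prompt. Bin edges and phrasing
    are hypothetical."""
    if not 1.0 <= score <= 5.0:
        raise ValueError("score must be in [1, 5]")
    levels = [
        (1.5, "non-diagnostic, with severe noise and artifacts"),
        (2.5, "poor, with heavy noise that obscures fine structures"),
        (3.5, "acceptable, with moderate noise but visible anatomy"),
        (4.5, "good, with mild noise and well-preserved details"),
        (5.1, "excellent, with minimal noise and sharp details"),
    ]
    # pick the first bin whose upper edge exceeds the score
    desc = next(d for upper, d in levels if score < upper)
    return f"The quality of this {modality} image is {desc} (score {score:.1f}/5)."
```

Turning ordinal scores into natural-language targets like this is what lets a captioning VLM, rather than a plain regressor, be trained on the annotations.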
Affiliation(s)
- Zhihao Chen
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, 200433, China
- Bin Hu
- Department of Radiology, Huashan Hospital, Fudan University, Shanghai, 200040, China
- Chuang Niu
- Biomedical Imaging Center, Center for Biotechnology and Interdisciplinary Studies, Department of Biomedical Engineering, Rensselaer Polytechnic Institute, Troy, NY, 12180, US
- Tao Chen
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, 200433, China
- Yuxin Li
- Department of Radiology, Huashan Hospital, Fudan University, Shanghai, 200040, China
- Hongming Shan
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, 200433, China
- MOE Frontiers Center for Brain Science, Fudan University, Shanghai, 200032, China
- Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence (Ministry of Education), Fudan University, Shanghai, 200433, China
- Ge Wang
- Biomedical Imaging Center, Center for Biotechnology and Interdisciplinary Studies, Department of Biomedical Engineering, Rensselaer Polytechnic Institute, Troy, NY, 12180, US
|
62
|
Ge R, Fang Z, Wei P, Chen Z, Jiang H, Elazab A, Li W, Wan X, Zhang S, Wang C. UWAFA-GAN: Ultra-Wide-Angle Fluorescein Angiography Transformation via Multi-Scale Generation and Registration Enhancement. IEEE J Biomed Health Inform 2024; 28:4820-4829. [PMID: 38683721 DOI: 10.1109/jbhi.2024.3394597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/02/2024]
Abstract
Fundus photography, in combination with ultra-wide-angle fundus (UWF) techniques, has become an indispensable diagnostic tool in clinical settings by offering a more comprehensive view of the retina. Nonetheless, unlike UWF scanning laser ophthalmoscopy (UWF-SLO), UWF fluorescein angiography (UWF-FA) necessitates the administration of a fluorescent dye via injection into the patient's hand or elbow. To mitigate potential adverse effects associated with injections, researchers have proposed cross-modality medical image generation algorithms capable of converting UWF-SLO images into their UWF-FA counterparts. Current image generation techniques applied to fundus photography encounter difficulties in producing high-resolution retinal images, particularly in capturing minute vascular lesions. To address these issues, we introduce a novel conditional generative adversarial network (UWAFA-GAN) to synthesize UWF-FA from UWF-SLO. This approach employs multi-scale generators and an attention transmit module to efficiently extract both global structures and local lesions. Additionally, to counteract the image blurriness that arises from training with misaligned data, a registration module is integrated within this framework. Our method performs well on inception scores and fine-detail generation. Clinical user studies further indicate that the UWF-FA images generated by UWAFA-GAN are clinically comparable to authentic images in terms of diagnostic reliability. Empirical evaluations on our proprietary UWF image datasets show that UWAFA-GAN outperforms existing methods.
|
63
|
Cam RM, Villa U, Anastasio MA. Learning a stable approximation of an existing but unknown inverse mapping: application to the half-time circular Radon transform. INVERSE PROBLEMS 2024; 40:085002. [PMID: 38933410 PMCID: PMC11197394 DOI: 10.1088/1361-6420/ad4f0a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/06/2023] [Revised: 04/05/2024] [Accepted: 05/22/2024] [Indexed: 06/28/2024]
Abstract
Supervised deep learning-based methods have inspired a new wave of image reconstruction methods that implicitly learn effective regularization strategies from a set of training data. While they hold potential for improving image quality, they have also raised concerns regarding their robustness. Instabilities can manifest when learned methods are applied to find approximate solutions to ill-posed image reconstruction problems for which a unique and stable inverse mapping does not exist, which is a typical use case. In this study, we investigate the performance of supervised deep learning-based image reconstruction in an alternate use case in which a stable inverse mapping is known to exist but is not yet analytically available in closed form. For such problems, a deep learning-based method can learn a stable approximation of the unknown inverse mapping that generalizes well to data that differ significantly from the training set. The learned approximation of the inverse mapping eliminates the need to employ an implicit (optimization-based) reconstruction method and can potentially yield insights into the unknown analytic inverse formula. The specific problem addressed is image reconstruction from a particular case of radially truncated circular Radon transform (CRT) data, referred to as 'half-time' measurement data. For the half-time image reconstruction problem, we develop and investigate a learned filtered backprojection method that employs a convolutional neural network to approximate the unknown filtering operation. We demonstrate that this method behaves stably and readily generalizes to data that differ significantly from training data. The developed method may find application to wave-based imaging modalities that include photoacoustic computed tomography.
Affiliation(s)
- Refik Mert Cam
- Department of Electrical and Computer Engineering, University of Illinois Urbana–Champaign, Urbana, IL 61801, United States of America
- Umberto Villa
- Oden Institute for Computational Engineering & Sciences, The University of Texas at Austin, Austin, TX 78712, United States of America
- Mark A Anastasio
- Department of Electrical and Computer Engineering, University of Illinois Urbana–Champaign, Urbana, IL 61801, United States of America
- Department of Bioengineering, University of Illinois Urbana–Champaign, Urbana, IL 61801, United States of America
|
64
|
Liao P, Zhang X, Wu Y, Chen H, Du W, Liu H, Yang H, Zhang Y. Weakly supervised low-dose computed tomography denoising based on generative adversarial networks. Quant Imaging Med Surg 2024; 14:5571-5590. [PMID: 39144020 PMCID: PMC11320552 DOI: 10.21037/qims-24-68] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/18/2024] [Accepted: 06/17/2024] [Indexed: 08/16/2024]
Abstract
Background Low-dose computed tomography (LDCT) is a diagnostic imaging technique designed to minimize radiation exposure to the patient. However, this reduction in radiation may compromise computed tomography (CT) image quality, adversely impacting clinical diagnoses. Various advanced LDCT methods have emerged to mitigate this challenge, relying on well-matched LDCT and normal-dose CT (NDCT) image pairs for training. Nevertheless, these methods often face difficulties in distinguishing image details from nonuniformly distributed noise, limiting their denoising efficacy. Additionally, acquiring suitably paired datasets in the medical domain poses challenges, further constraining their applicability. Hence, the objective of this study was to develop an innovative denoising framework for LDCT images employing unpaired data. Methods In this paper, we propose an LDCT denoising network (DNCNN) that alleviates the need for aligning LDCT and NDCT images. Our approach employs generative adversarial networks (GANs) to learn and model the noise present in LDCT images, establishing a mapping from the pseudo-LDCT to the actual NDCT domain without the need for paired CT images. Results Within the domain of weakly supervised methods, our proposed model exhibited superior objective metrics on the simulated dataset when compared to CycleGAN and the selective kernel-based cycle-consistent GAN (SKFCycleGAN): the peak signal-to-noise ratio (PSNR) was 43.9441, the structural similarity index measure (SSIM) was 0.9660, and the visual information fidelity (VIF) was 0.7707. In the clinical dataset, we conducted a visual-effect analysis by observing various tissues through different observation windows. Our proposed method achieved a no-reference structural sharpness (NRSS) value of 0.6171, which was closest to that of the NDCT images (NRSS = 0.6049), demonstrating its superiority over other denoising techniques in preserving details, maintaining structural integrity, and enhancing edge contrast. Conclusions Through extensive experiments on both simulated and clinical datasets, we demonstrated the superior qualitative and quantitative denoising performance of our proposed method. It surpasses supervised techniques, including block-matching and 3D filtering (BM3D), the residual encoder-decoder convolutional neural network (RED-CNN), and the Wasserstein generative adversarial network with VGG loss (WGAN-VGG), as well as weakly supervised approaches, including CycleGAN and SKFCycleGAN.
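The fidelity metrics reported in this abstract are standard; PSNR, for example, can be computed as below (a minimal numpy version; SSIM, VIF, and NRSS require more involved implementations):

```python
import numpy as np

def psnr(reference: np.ndarray, test: np.ndarray, data_range: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE),
    where MAX is the dynamic range of the reference image."""
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)
```

For instance, a denoised image whose pixelwise error is uniformly 0.01 on a unit-range reference scores 40 dB, which puts values such as the 43.94 dB above in context.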
Affiliation(s)
- Peixi Liao
- Department of Stomatology, The Sixth People’s Hospital of Chengdu, Chengdu, China
- Xucan Zhang
- The National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu, China
- Yaoyao Wu
- The School of Computer Science, Sichuan University, Chengdu, China
- Hu Chen
- The College of Computer Science, Sichuan University, Chengdu, China
- Wenchao Du
- The College of Computer Science, Sichuan University, Chengdu, China
- Hong Liu
- The College of Computer Science, Sichuan University, Chengdu, China
- Hongyu Yang
- The College of Computer Science, Sichuan University, Chengdu, China
- Yi Zhang
- The School of Cyber Science and Engineering, Sichuan University, Chengdu, China
|
65
|
Chi J, Sun Z, Tian S, Wang H, Wang S. A Hybrid Framework of Dual-Domain Signal Restoration and Multi-depth Feature Reinforcement for Low-Dose Lung CT Denoising. JOURNAL OF IMAGING INFORMATICS IN MEDICINE 2024; 37:1944-1959. [PMID: 38424278 PMCID: PMC11300419 DOI: 10.1007/s10278-023-00934-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/06/2023] [Revised: 09/05/2023] [Accepted: 09/06/2023] [Indexed: 03/02/2024]
Abstract
Low-dose computed tomography (LDCT) has been widely used in medical diagnosis. Various denoising methods have been presented to remove noise in LDCT scans. However, existing methods cannot achieve satisfactory results due to the difficulties in (1) distinguishing the characteristics of structures, textures, and noise confused in the image domain, and (2) representing local details and global semantics in the hierarchical features. In this paper, we propose a novel denoising method consisting of (1) a 2D dual-domain restoration framework to reconstruct noise-free structure and texture signals separately, and (2) a 3D multi-depth reinforcement U-Net model to further recover image details with enhanced hierarchical features. In the 2D dual-domain restoration framework, convolutional neural networks are adopted in both the image domain, where the image structures are well preserved through spatial continuity, and the sinogram domain, where the textures and noise are separately represented by different wavelet coefficients and processed adaptively. In the 3D multi-depth reinforcement U-Net model, the hierarchical features from the 3D U-Net are enhanced by a cross-resolution attention module (CRAM) and a dual-branch graph convolution module (DBGCM). The CRAM preserves local details by integrating adjacent low-level features with different resolutions, while the DBGCM enhances global semantics by building graphs for high-level features along intra-feature and inter-feature dimensions. Experimental results on the LUNA16 dataset and the 2016 NIH-AAPM-Mayo Clinic LDCT Grand Challenge dataset illustrate that the proposed method outperforms state-of-the-art methods in removing noise from LDCT images while retaining clear structures and textures, demonstrating its potential in clinical practice.
Affiliation(s)
- Jianning Chi
- Faculty of Robot Science and Engineering, Northeastern University, Zhihui Street, Shenyang, 110169, Liaoning, China.
- Key Laboratory of Intelligent Computing in Medical Image of Ministry of Education, Northeastern University, Zhihui Street, Shenyang, 110169, Liaoning, China.
- Zhiyi Sun
- Faculty of Robot Science and Engineering, Northeastern University, Zhihui Street, Shenyang, 110169, Liaoning, China
- Shuyu Tian
- Graduate School, Dalian Medical University, Lyushunnan, Dalian, 116000, Liaoning, China
- Huan Wang
- Faculty of Robot Science and Engineering, Northeastern University, Zhihui Street, Shenyang, 110169, Liaoning, China
- Siqi Wang
- Faculty of Robot Science and Engineering, Northeastern University, Zhihui Street, Shenyang, 110169, Liaoning, China
|
66
|
Du W, Cui H, He L, Chen H, Zhang Y, Yang H. Structure-aware diffusion for low-dose CT imaging. Phys Med Biol 2024; 69:155008. [PMID: 38942004 DOI: 10.1088/1361-6560/ad5d47] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2024] [Accepted: 06/28/2024] [Indexed: 06/30/2024]
Abstract
Reducing the radiation dose causes x-ray computed tomography (CT) images to suffer from heavy noise and artifacts, which inevitably interfere with subsequent clinical diagnosis and analysis. Leading works have explored diffusion models for low-dose CT imaging to avoid the structure degeneration and blurring effects of previous deep denoising models. However, most of them begin their generative processes with Gaussian noise, which carries little or no structural prior from the clean data distribution, leading to long inference times and poor reconstruction quality. To alleviate these problems, this paper presents a Structure-Aware Diffusion model (SAD), an end-to-end self-guided learning framework for high-fidelity CT image reconstruction. First, SAD builds a nonlinear diffusion bridge between the clean and degraded data distributions, which can directly learn the implicit physical degradation prior from observed measurements. Second, SAD integrates a prompt learning mechanism and implicit neural representation into the diffusion process, where rich and diverse structure representations extracted from degraded inputs are exploited as prompts, providing global and local structure priors to guide CT image reconstruction. Finally, we devise an efficient self-guided diffusion architecture with an iterative update strategy, which further refines the structural prompts during each generative step to drive finer image reconstruction. Extensive experiments on the AAPM-Mayo and LoDoPaB-CT datasets demonstrate that SAD achieves superior performance in terms of noise removal, structure preservation, and blind-dose generalization, with few generative steps, even only one.
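The "diffusion bridge" contrast drawn in this abstract (starting generation from the degraded measurement rather than from pure Gaussian noise) can be sketched schematically. The linear bridge below is a generic stand-in, not the authors' nonlinear parameterization:

```python
import numpy as np

rng = np.random.default_rng(0)

def bridge_sample(x_clean: np.ndarray, x_degraded: np.ndarray,
                  t: float, sigma: float = 0.05) -> np.ndarray:
    """Sample from a simple linear bridge between clean and degraded images
    at time t in [0, 1]. At t=0 the sample equals the clean image; at t=1
    it equals the degraded measurement. The sqrt(t(1-t)) schedule pins the
    noise to zero at both endpoints. Purely illustrative; SAD uses a
    learned nonlinear bridge."""
    mean = (1.0 - t) * x_clean + t * x_degraded
    std = sigma * np.sqrt(t * (1.0 - t))
    return mean + std * rng.normal(size=x_clean.shape)
```

Because the t=1 endpoint is the degraded image itself, a reverse process over this bridge starts with strong structural priors already in place, which is why such bridges can use far fewer generative steps than noise-initialized diffusion.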
Affiliation(s)
- Wenchao Du
- College of Computer Science, Sichuan University, Chengdu 610065, People's Republic of China
- HuanHuan Cui
- West China Hospital of Sichuan University, Chengdu 610041, People's Republic of China
- LinChao He
- College of Computer Science, Sichuan University, Chengdu 610065, People's Republic of China
- Hu Chen
- College of Computer Science, Sichuan University, Chengdu 610065, People's Republic of China
- Yi Zhang
- College of Computer Science, Sichuan University, Chengdu 610065, People's Republic of China
- Hongyu Yang
- College of Computer Science, Sichuan University, Chengdu 610065, People's Republic of China
|
67
|
Zhang K, Niu T, Xu L. DeCoGAN: MVCT image denoising via coupled generative adversarial network. Phys Med Biol 2024; 69:145007. [PMID: 38979700 DOI: 10.1088/1361-6560/ad5d4c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2024] [Accepted: 06/28/2024] [Indexed: 07/10/2024]
Abstract
Objective. In helical tomotherapy, image-guided radiotherapy employs megavoltage computed tomography (MVCT) for precise targeting. However, megavoltage radiation introduces substantial noise, significantly compromising MVCT image clarity. This study aims to enhance MVCT image quality using a deep learning-based denoising method. Approach. We propose an unpaired MVCT denoising network using a coupled generative adversarial network framework (DeCoGAN). Our approach assumes that a universal latent code within a shared latent space can reconstruct any given pair of images. By employing an encoder, we enforce this shared-latent-space constraint, facilitating the conversion of low-quality (noisy) MVCT images into high-quality (denoised) counterparts. The network learns the joint distribution of images from both domains by leveraging samples from their respective marginal distributions, enhanced by adversarial training for effective denoising. Main Results. Compared to an analytical algorithm (BM3D) and three deep learning-based methods (RED-CNN, WGAN-VGG, and CycleGAN), the proposed method excels at preserving image details and enhancing human visual perception by removing most noise while retaining structural features. Quantitative analysis demonstrates that our method achieves the highest peak signal-to-noise ratio and structural similarity index measure values, indicating superior denoising performance. Significance. The proposed DeCoGAN method shows remarkable MVCT denoising performance, making it a promising tool in the field of radiation therapy.
Affiliation(s)
- Kunpeng Zhang
- Department of Radiation Oncology, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, People's Republic of China
- Tianye Niu
- Institute of Biomedical Engineering, Shenzhen Bay Laboratory, Shenzhen, Guangdong, People's Republic of China
- Peking University Aerospace School of Clinical Medicine, Aerospace Center Hospital, Beijing, People's Republic of China
- Lei Xu
- Department of Radiation Oncology, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi, People's Republic of China
|
68
|
Baker RR, Muthurangu V, Rega M, Walsh SB, Steeden JA. Rapid 2D 23Na MRI of the calf using a denoising convolutional neural network. Magn Reson Imaging 2024; 110:184-194. [PMID: 38642779 DOI: 10.1016/j.mri.2024.04.027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/06/2024] [Revised: 04/12/2024] [Accepted: 04/17/2024] [Indexed: 04/22/2024]
Abstract
PURPOSE 23Na MRI can be used to quantify in-vivo tissue sodium concentration (TSC), but the inherently low 23Na signal leads to long scan times and/or noisy or low-resolution images. Reconstruction algorithms such as compressed sensing (CS) have been proposed to mitigate the low signal-to-noise ratio (SNR), although these can produce unnatural-looking images, suboptimal denoising, and long processing times. Recently, machine learning has been increasingly used to denoise 1H MRI acquisitions; however, this approach typically requires large volumes of high-quality training data, which are not readily available for 23Na MRI. Here, we propose using 1H data to train a denoising convolutional neural network (CNN), which we subsequently demonstrate on prospective 23Na images of the calf. METHODS 1893 1H fat-saturated transverse slices of the knee from the open-source fastMRI dataset were used to train denoising CNNs for different levels of noise. Synthetic low-SNR images were generated by adding Gaussian noise to the high-quality 1H k-space data before reconstruction to create paired training data. For prospective testing, 23Na images of the calf were acquired in 10 healthy volunteers with a total of 150 averages over ten minutes, which served as a reference throughout the study. From these data, images with fewer averages were retrospectively reconstructed using a non-uniform fast Fourier transform (NUFFT) as well as CS, with the NUFFT images subsequently denoised using the trained CNN. RESULTS CNNs were successfully applied to 23Na images reconstructed with 50, 40, and 30 averages. Muscle and skin apparent TSC quantification from CNN-denoised images was equivalent to that from CS images, with <0.9 mM bias compared to reference values. Estimated SNR was significantly higher in CNN-denoised images than in NUFFT, CS, and reference images. Quantitative edge sharpness was equivalent for all images. In subjective image-quality ranking, CNN-denoised images ranked jointly highest with the reference images and significantly better than the NUFFT and CS images. CONCLUSION Denoising CNNs trained on 1H data can be successfully applied to 23Na images of the calf, allowing scan time to be reduced from ten minutes to two minutes with little impact on image quality or on the accuracy of apparent TSC quantification.
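The paired training-data generation described in this abstract, adding Gaussian noise in k-space before reconstruction, can be sketched as below. This is a minimal Cartesian-FFT illustration; the noise level and any scaling are illustrative parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

def synthesize_low_snr(image: np.ndarray, noise_std: float) -> np.ndarray:
    """Create a synthetic low-SNR magnitude image by adding complex
    Gaussian noise to the k-space (Fourier) data of a high-quality image.
    noise_std is relative to the k-space magnitude scale and is an
    illustrative parameter."""
    kspace = np.fft.fft2(image)
    noise = noise_std * (rng.normal(size=kspace.shape)
                         + 1j * rng.normal(size=kspace.shape))
    return np.abs(np.fft.ifft2(kspace + noise))
```

Pairing each clean image with its synthetic low-SNR counterpart yields the (input, target) pairs used to train the denoising CNN at a chosen noise level.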
Affiliation(s)
- Rebecca R Baker
- UCL Centre for Medical Imaging, University College London, London, UK; UCL Centre for Translational Cardiovascular Imaging, University College London, London, UK.
- Vivek Muthurangu
- UCL Centre for Translational Cardiovascular Imaging, University College London, London, UK
- Marilena Rega
- Institute of Nuclear Medicine, University College Hospital, London, UK
- Stephen B Walsh
- Department of Renal Medicine, University College London, London, UK
- Jennifer A Steeden
- UCL Centre for Translational Cardiovascular Imaging, University College London, London, UK
|
69
|
Yin Z, Wu P, Manohar A, McVeigh ER, Pack JD. Protocol optimization for functional cardiac CT imaging using noise emulation in the raw data domain. Med Phys 2024; 51:4622-4634. [PMID: 38753583 PMCID: PMC11547861 DOI: 10.1002/mp.17088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2023] [Accepted: 03/29/2024] [Indexed: 05/18/2024] Open
Abstract
BACKGROUND Four-dimensional (4D) wide-coverage computed tomography (CT) is an effective imaging modality for measuring the mechanical function of the myocardium. However, repeated CT measurement across a number of heartbeats remains a dose concern. PURPOSE A projection-domain noise emulation method is presented to generate accurate low-dose (mA-modulated) 4D cardiac CT scans from high-dose scans, enabling protocol optimization to deliver sufficient image quality for functional cardiac analysis at a dose level that is as low as reasonably achievable (ALARA). METHODS Given a targeted low-dose mA modulation curve, the proposed noise emulation method injects both quantum and electronic noise of proper magnitude and correlation into the high-dose data in the projection domain. A spatially varying (i.e., channel-dependent) detector gain term, together with its calibration method, is proposed to further improve the noise emulation accuracy. To determine the ALARA dose threshold, a straightforward projection-domain image quality (IQ) metric is proposed, based on the number of projection rays that do not fall within the non-linear region of the detector response. Experiments were performed to validate the noise emulation method on both phantom and clinical data in terms of visual similarity, contrast-to-noise ratio (CNR), and noise power spectrum (NPS). RESULTS For both phantom and clinical data, the low-dose emulated images exhibited noise magnitude (CNR difference within 2%), artifacts, and texture similar to those of the real low-dose images. The proposed channel-dependent detector gain term further increased emulation accuracy. Using the proposed IQ metric, recommended kVp and mA settings were calculated for low-dose 4D cardiac CT acquisitions for patients of different sizes. CONCLUSIONS A detailed method to estimate system-dependent parameters for a raw-data-based low-dose emulation framework was described. The method produced realistic noise levels, artifacts, and texture in phantom and clinical studies. The proposed low-dose emulation method can be used to prospectively select patient-specific minimal-dose protocols for functional cardiac CT.
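The quantum-plus-electronic noise model underlying this kind of projection-domain emulation can be sketched as follows. Note this simulates low-dose projections from ideal line integrals; the paper instead injects only the *difference* noise into already-noisy high-dose data and calibrates a channel-dependent detector gain, which is not modeled here:

```python
import numpy as np

rng = np.random.default_rng(0)

def emulate_low_dose(p_ideal: np.ndarray, i0_low: float,
                     sigma_e: float = 0.0) -> np.ndarray:
    """Simulate low-dose projections from (nearly noise-free) line
    integrals p_ideal by injecting quantum (Poisson) and electronic
    (Gaussian) noise at the lower tube output i0_low (photons per ray).
    Schematic only; parameters are illustrative."""
    counts = i0_low * np.exp(-p_ideal)                 # expected photon counts
    noisy = rng.poisson(counts).astype(float)          # quantum noise
    noisy += sigma_e * rng.normal(size=p_ideal.shape)  # electronic noise
    noisy = np.clip(noisy, 1.0, None)                  # guard the logarithm
    return -np.log(noisy / i0_low)                     # back to line integrals
```

Lowering `i0_low` (fewer photons, i.e., lower mA) makes the relative Poisson fluctuation larger and lets the fixed electronic-noise floor dominate, which is exactly the regime the emulation has to reproduce faithfully.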
Affiliation(s)
- Zhye Yin
- GE HealthCare, Waukesha, WI, USA
- Pengwei Wu
- GE HealthCare Technology & Innovation Center, Niskayuna, NY, USA
- Ashish Manohar
- Dept. of Medicine, Stanford University, Palo Alto, CA, USA
- Elliot R. McVeigh
- Dept. of Bioengineering, Medicine, Radiology at University of California San Diego, San Diego, CA, USA
- Jed D. Pack
- GE HealthCare Technology & Innovation Center, Niskayuna, NY, USA
|
70
|
Tsanda A, Nickisch H, Wissel T, Klinder T, Knopp T, Grass M. Dose robustness of deep learning models for anatomic segmentation of computed tomography images. J Med Imaging (Bellingham) 2024; 11:044005. [PMID: 39099642 PMCID: PMC11293838 DOI: 10.1117/1.jmi.11.4.044005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/14/2023] [Revised: 04/23/2024] [Accepted: 07/10/2024] [Indexed: 08/06/2024] Open
Abstract
Purpose The trend towards lower radiation doses and advances in computed tomography (CT) reconstruction may impair the operation of pretrained segmentation models, giving rise to the problem of estimating the dose robustness of existing segmentation models. Previous studies addressing the issue suffer either from a lack of registered low- and full-dose CT images or from simplified simulations. Approach We employed raw data from full-dose acquisitions to simulate low-dose CT scans, avoiding the need to rescan a patient. The accuracy of the simulation is validated using a real CT scan of a phantom. We consider reductions down to 20% of the full radiation dose, for which we measure deviations of several pretrained segmentation models from the full-dose prediction. In addition, compatibility with existing denoising methods is considered. Results The results reveal the surprising robustness of the TotalSegmentator approach, showing minimal differences at the pixel level even without denoising. Less robust models show good compatibility with the denoising methods, which help to improve robustness in almost all cases. With denoising based on a convolutional neural network (CNN), the median Dice between low- and full-dose data does not fall below 0.9 (and the Hausdorff distance does not exceed 12) for all but one model. We observe volatile results for labels with effective radii of less than 19 mm and improved results for contrasted CT acquisitions. Conclusion The proposed approach facilitates clinically relevant analysis of dose robustness for human organ segmentation models. The results outline the robustness properties of a diverse set of models. Further studies are needed to identify the robustness of approaches for lesion segmentation and to rank the factors contributing to dose robustness.
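The Dice similarity coefficient used in this abstract to compare low-dose against full-dose segmentations is a standard overlap measure and is straightforward to compute:

```python
import numpy as np

def dice(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks:
    2|A intersect B| / (|A| + |B|). A value of 1 means perfect overlap,
    0 means no overlap."""
    a, b = mask_a.astype(bool), mask_b.astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: define as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / denom
```

In a dose-robustness study like this one, `mask_a` would be the model's segmentation of the full-dose reconstruction and `mask_b` its segmentation of the simulated low-dose reconstruction of the same scan.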
Affiliation(s)
- Artyom Tsanda
- Hamburg University of Technology, Institute for Biomedical Imaging, Hamburg, Germany
- Philips Innovative Technologies, Hamburg, Germany
- Tobias Knopp
- Hamburg University of Technology, Institute for Biomedical Imaging, Hamburg, Germany
- University Medical Center Hamburg-Eppendorf, Section for Biomedical Imaging, Hamburg, Germany
|
71
|
Chaudhary MFA, Gerard SE, Christensen GE, Cooper CB, Schroeder JD, Hoffman EA, Reinhardt JM. LungViT: Ensembling Cascade of Texture Sensitive Hierarchical Vision Transformers for Cross-Volume Chest CT Image-to-Image Translation. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:2448-2465. [PMID: 38373126 PMCID: PMC11227912 DOI: 10.1109/tmi.2024.3367321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/21/2024]
Abstract
Chest computed tomography (CT) at inspiration is often complemented by an expiratory CT to identify peripheral airways disease. Additionally, co-registered inspiratory-expiratory volumes can be used to derive various markers of lung function. Expiratory CT scans, however, may not be acquired due to dose or scan time considerations, or may be inadequate due to motion or insufficient exhale, leading to a missed opportunity to evaluate underlying small airways disease. Here, we propose LungViT, a generative adversarial learning approach using hierarchical vision transformers for translating inspiratory CT intensities to corresponding expiratory CT intensities. LungViT addresses several limitations of traditional generative models, including slicewise discontinuities, limited size of generated volumes, and their inability to model texture transfer at the volumetric level. We propose a shifted-window hierarchical vision transformer architecture with squeeze-and-excitation decoder blocks for modeling dependencies between features. We also propose a multiview texture similarity distance metric for texture and style transfer in 3D. To incorporate global information into the training process and refine the output of our model, we use ensemble cascading. LungViT is able to generate large 3D volumes of size 320×320×320. We train and validate our model using a diverse cohort of 1500 subjects with varying disease severity. To assess model generalizability beyond the development set biases, we evaluate our model on an out-of-distribution external validation set of 200 subjects. Clinical validation on internal and external testing sets shows that synthetic volumes could be reliably adopted for deriving clinical endpoints of chronic obstructive pulmonary disease.
72
Komolafe TE, Zhou L, Zhao W, Wang N, Wu T. EDRAM-Net: Encoder-Decoder with Residual Attention Module Network for Low-dose Computed Tomography Reconstruction. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2024; 2024:1-6. [PMID: 40039194 DOI: 10.1109/embc53108.2024.10781702] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/06/2025]
Abstract
Computed tomography (CT) provides detailed anatomical structures of patients without the need for invasive procedures such as surgery, which is very useful for clinicians in disease diagnosis. However, excessive radiation exposure can lead to the development of cancers. Low-dose CT (LDCT) acquisition is an effective way to reduce this exposure, but the reconstructed CT images tend to be degraded, and the resulting loss of vital information is one of the most significant drawbacks of the technique. In the past few years, multiscale convolutional networks (MSCN) have been widely adopted in LDCT reconstruction to preserve vital details in reconstructed images. Inspired by this, we propose an encoder-decoder network with a residual attention module (EDRAM-Net) for LDCT reconstruction. The proposed EDRAM-Net embeds cascaded residual attention module (RAM) blocks into the skip connections of the encoder-decoder architecture. Specifically, the encoder captures and encodes details in the latent space, which is reconstructed in the decoder of the network. Each RAM block consists of three modules: the MSCN, a channel attention module (CAM), and a spatial attention module (SAM). The MSCN captures features at different scales, while the CAM and SAM focus on channel and spatial details during reconstruction. Evaluated on the public AAPM low-dose dataset, EDRAM-Net shows improved performance in terms of image quality metrics compared to competing methods. The ablation study further revealed that using a kernel size of 7×7 for the RAM block significantly enhanced the performance of our model. It was also observed that a larger number of RAM blocks yielded improved performance, but at the expense of computational complexity.
73
Li G, Deng Z, Ge Y, Luo S. HEAL: High-Frequency Enhanced and Attention-Guided Learning Network for Sparse-View CT Reconstruction. Bioengineering (Basel) 2024; 11:646. [PMID: 39061728 PMCID: PMC11273693 DOI: 10.3390/bioengineering11070646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/14/2024] [Revised: 06/08/2024] [Accepted: 06/18/2024] [Indexed: 07/28/2024] Open
Abstract
X-ray computed tomography (CT) imaging technology has become an indispensable diagnostic tool in clinical examination. However, it poses a risk of ionizing radiation, making the reduction of radiation dose one of the current research hotspots in CT imaging. Sparse-view imaging, as one of the main methods for reducing radiation dose, has made significant progress in recent years. In particular, sparse-view reconstruction methods based on deep learning have shown promising results. Nevertheless, efficiently recovering image details under ultra-sparse conditions remains a challenge. To address this challenge, this paper proposes a high-frequency enhanced and attention-guided learning Network (HEAL). HEAL includes three optimization strategies to achieve detail enhancement: Firstly, we introduce a dual-domain progressive enhancement module, which leverages fidelity constraints within each domain and consistency constraints across domains to effectively narrow the solution space. Secondly, we incorporate both channel and spatial attention mechanisms to improve the network's feature-scaling process. Finally, we propose a high-frequency component enhancement regularization term that integrates residual learning with direction-weighted total variation, utilizing directional cues to effectively distinguish between noise and textures. The HEAL network is trained, validated and tested under different ultra-sparse configurations of 60 views and 30 views, demonstrating its advantages in reconstruction accuracy and detail enhancement.
Affiliation(s)
- Guang Li
- Jiangsu Key Laboratory for Biomaterials and Devices, School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China; (G.L.); (Z.D.)
- Zhenhao Deng
- Jiangsu Key Laboratory for Biomaterials and Devices, School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China; (G.L.); (Z.D.)
- Yongshuai Ge
- Research Center for Medical Artificial Intelligence, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Paul C Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Key Laboratory of Biomedical Imaging Science and System, Chinese Academy of Sciences, Shenzhen 518055, China
- Shouhua Luo
- Jiangsu Key Laboratory for Biomaterials and Devices, School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China; (G.L.); (Z.D.)
74
彭 声, 王 永, 边 兆, 马 建, 黄 静. [A dual-domain cone beam computed tomography reconstruction framework with improved differentiable domain transform for cone-angle artifact correction]. NAN FANG YI KE DA XUE XUE BAO = JOURNAL OF SOUTHERN MEDICAL UNIVERSITY 2024; 44:1188-1197. [PMID: 38977350 PMCID: PMC11237300 DOI: 10.12122/j.issn.1673-4254.2024.06.21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Subscribe] [Scholar Register] [Received: 02/21/2024] [Indexed: 07/10/2024]
Abstract
OBJECTIVE We propose a dual-domain cone beam computed tomography (CBCT) reconstruction framework, DualCBR-Net, based on an improved differentiable domain transform for cone-angle artifact correction. METHODS The proposed CBCT dual-domain reconstruction framework DualCBR-Net consists of 3 individual modules: projection preprocessing, differentiable domain transform, and image post-processing. The projection preprocessing module first extends the original projection data in the row direction to ensure full coverage of the scanned object by the X-ray beam. The differentiable domain transform introduces the FDK reconstruction and forward projection operators to complete the forward and gradient backpropagation processes, where the geometric parameters correspond to the extended data dimension to provide crucial prior information in the forward pass of the network and to ensure accuracy in gradient backpropagation, thus enabling precise learning of cone-beam region data. The image post-processing module further fine-tunes the domain-transformed image to remove residual artifacts and noise. RESULTS Validation experiments conducted on Mayo's public chest dataset showed that the proposed DualCBR-Net framework was superior to other comparison methods in terms of artifact removal and structural detail preservation. Compared with the latest methods, the DualCBR-Net framework improved PSNR and SSIM by 0.6479 and 0.0074, respectively. CONCLUSION The proposed DualCBR-Net framework for cone-angle artifact correction allows effective joint training of the CBCT dual-domain network and is especially effective for large cone-angle regions.
75
Yao L, Wang J, Wu Z, Du Q, Yang X, Li M, Zheng J. Parallel processing model for low-dose computed tomography image denoising. Vis Comput Ind Biomed Art 2024; 7:14. [PMID: 38865022 PMCID: PMC11169366 DOI: 10.1186/s42492-024-00165-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/19/2024] [Accepted: 05/20/2024] [Indexed: 06/13/2024] Open
Abstract
Low-dose computed tomography (LDCT) has gained increasing attention owing to its crucial role in reducing radiation exposure in patients. However, LDCT-reconstructed images often suffer from significant noise and artifacts, negatively impacting the radiologists' ability to accurately diagnose. To address this issue, many studies have focused on denoising LDCT images using deep learning (DL) methods. However, these DL-based denoising methods have been hindered by the highly variable feature distribution of LDCT data from different imaging sources, which adversely affects the performance of current denoising models. In this study, we propose a parallel processing model, the multi-encoder deep feature transformation network (MDFTN), which is designed to enhance the performance of LDCT imaging for multisource data. Unlike traditional network structures, which rely on continual learning to process multitask data, the approach can simultaneously handle LDCT images within a unified framework from various imaging sources. The proposed MDFTN consists of multiple encoders and decoders along with a deep feature transformation module (DFTM). During forward propagation in network training, each encoder extracts diverse features from its respective data source in parallel and the DFTM compresses these features into a shared feature space. Subsequently, each decoder performs an inverse operation for multisource loss estimation. Through collaborative training, the proposed MDFTN leverages the complementary advantages of multisource data distribution to enhance its adaptability and generalization. Numerous experiments were conducted on two public datasets and one local dataset, which demonstrated that the proposed network model can simultaneously process multisource data while effectively suppressing noise and preserving fine structures. The source code is available at https://github.com/123456789ey/MDFTN .
Affiliation(s)
- Libing Yao
- School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China
- Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Jiping Wang
- School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China
- Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Zhongyi Wu
- Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China.
- Qiang Du
- Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Xiaodong Yang
- Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
- Ming Li
- School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China.
- Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China.
- Jian Zheng
- School of Biomedical Engineering (Suzhou), Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230026, China
- Medical Imaging Department, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou, 215163, China
76
Ko Y, Song S, Baek J, Shim H. Adapting low-dose CT denoisers for texture preservation using zero-shot local noise-level matching. Med Phys 2024; 51:4181-4200. [PMID: 38478305 DOI: 10.1002/mp.17015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2023] [Revised: 01/27/2024] [Accepted: 01/28/2024] [Indexed: 06/05/2024] Open
Abstract
BACKGROUND In enhancing the image quality of low-dose computed tomography (LDCT), various denoising methods have achieved meaningful improvements. However, they commonly produce over-smoothed results; the denoised images tend to be more blurred than the normal-dose targets (NDCTs). Furthermore, many recent denoising methods employ deep learning (DL)-based models, which require a vast number of CT images (or image pairs). PURPOSE Our goal is to address the problem of over-smoothed results and to design an algorithm that achieves plausible denoising without requiring a large training dataset. Over-smoothed images negatively affect diagnosis and treatment, since radiologists have developed clinical experience with NDCT; moreover, a large-scale training dataset is often unavailable in clinical situations. To overcome these limitations, we propose locally-adaptive noise-level matching (LANCH), emphasizing that the output should retain the same noise level and characteristics as the NDCT, without additional training. METHODS We represent the NDCT image as the pixel-wise weighted sum of an over-smoothed output from an off-the-shelf denoiser (OSD) and the difference between the LDCT image and the OSD output. LANCH determines a 2D ratio map (i.e., a pixel-wise weight matrix) by locally matching the noise level of the output to that of the NDCT, where the LDCT-to-NDCT device flux (mAs) ratio reveals the NDCT noise level. Thereby, LANCH can preserve important details in LDCT and enhance the sharpness of noise-free regions. Note that LANCH can enhance any LDCT denoiser without additional training data (i.e., zero-shot). RESULTS The proposed method is applicable to any OSD denoiser, and it significantly improved texture plausibility over the baseline denoisers in both quantitative and qualitative evaluations. The denoising accuracy achieved by our zero-shot method was comparable or superior to that of the best training-based denoisers, with gains of 1% and 33% in terms of SSIM and DISTS, respectively. A reader study with experienced radiologists showed significant image quality improvements, with a gain of +1.18 on a five-point mean opinion score scale. CONCLUSIONS We propose a technique that enhances any low-dose CT denoiser by leveraging the fundamental physical relationship between x-ray flux and noise variance. Our method operates in a zero-shot condition: only a single low-dose CT image is required for the enhancement process. We demonstrate that our approach is comparable or even superior to supervised DL-based denoisers trained using numerous CT images. Extensive experiments illustrate that our method consistently improves the performance of all tested LDCT denoisers.
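The pixel-wise blending at the core of the LANCH abstract above can be sketched in a few lines (a minimal illustration of the described weighted sum, not the authors' implementation; function and variable names are hypothetical):

```python
import numpy as np

def lanch_blend(ldct, osd_out, ratio_map):
    """Recombine an over-smoothed denoiser output with the LDCT residual.

    ldct      : noisy low-dose CT image
    osd_out   : over-smoothed output of an off-the-shelf denoiser (OSD)
    ratio_map : 2D pixel-wise weight in [0, 1]; 0 keeps the smooth OSD
                output, 1 restores the full LDCT detail (and noise)
    """
    return osd_out + ratio_map * (ldct - osd_out)
```

In the paper the ratio map is chosen so that the local noise level of the blended output matches that of a normal-dose scan, inferred from the mAs ratio; here it is simply taken as an input.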
Affiliation(s)
- Youngjun Ko
- School of Integrated Technology, Yonsei University, Incheon, South Korea
- Seongjong Song
- School of Integrated Technology, Yonsei University, Incheon, South Korea
- Jongduk Baek
- School of Integrated Technology, Yonsei University, Incheon, South Korea
77
Zakeri A, Hokmabadi A, Nix MG, Gooya A, Wijesinghe I, Taylor ZA. 4D-Precise: Learning-based 3D motion estimation and high temporal resolution 4DCT reconstruction from treatment 2D+t X-ray projections. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 250:108158. [PMID: 38604010 DOI: 10.1016/j.cmpb.2024.108158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/08/2023] [Revised: 03/23/2024] [Accepted: 03/29/2024] [Indexed: 04/13/2024]
Abstract
BACKGROUND AND OBJECTIVE In radiotherapy treatment planning, respiration-induced motion introduces uncertainty that, if not appropriately considered, could result in dose delivery problems. 4D cone-beam computed tomography (4D-CBCT) has been developed to provide imaging guidance by reconstructing a pseudo-motion sequence of CBCT volumes through binning projection data into breathing phases. However, it suffers from artefacts and erroneously characterizes the averaged breathing motion. Furthermore, conventional 4D-CBCT can only be generated post-hoc using the full sequence of kV projections after the treatment is complete, limiting its utility. Hence, our purpose is to develop a deep-learning motion model for estimating 3D+t CT images from treatment kV projection series. METHODS We propose an end-to-end learning-based 3D motion modelling and 4DCT reconstruction model named 4D-Precise, abbreviated from Probabilistic reconstruction of image sequences from CBCT kV projections. The model estimates voxel-wise motion fields and simultaneously reconstructs a 3DCT volume at any arbitrary time point of the input projections by transforming a reference CT volume. A custom Torch-DRR module enables end-to-end training by computing Digitally Reconstructed Radiographs (DRRs) in PyTorch. During training, DRRs with projection angles matching the input kVs are automatically extracted from reconstructed volumes, and their structural dissimilarity to the inputs is penalised. We introduce a novel loss function to regulate spatio-temporal motion field variations across the CT scan, leveraging the planning 4DCT for prior motion distribution estimation. RESULTS The model is trained patient-specifically using three kV scan series, each including over 1200 angular/temporal projections, and tested on three other scan series. Imaging data from five patients are analysed here. The model is also validated on a simulated paired 4DCT-DRR dataset created using Surrogate Parametrised Respiratory Motion Modelling (SuPReMo). The results demonstrate that the volumes reconstructed by 4D-Precise closely resemble the ground-truth volumes in terms of Dice, volume similarity, mean contour distance, and Hausdorff distance, while 4D-Precise achieves smoother deformations and fewer negative Jacobian determinants than SuPReMo. CONCLUSIONS Unlike conventional 4DCT reconstruction techniques, which ignore inter-cycle breathing motion variations, the proposed model computes both intra-cycle and inter-cycle motions. It represents motion over an extended timeframe, covering several minutes of kV scan series.
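The Dice overlap used for validation in the abstract above is a standard segmentation-agreement measure; a minimal sketch (not the authors' code) is:

```python
import numpy as np

def dice_coefficient(a, b):
    """Dice overlap between two binary masks: 2|A∩B| / (|A| + |B|)."""
    a = np.asarray(a).astype(bool)
    b = np.asarray(b).astype(bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / denom
```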
Affiliation(s)
- Arezoo Zakeri
- Centre for Computational Imaging and Simulation Technologies in Biomedicine, School of Computing, University of Leeds, Leeds, UK.
- Alireza Hokmabadi
- Department of Infection, Immunity & Cardio Disease, University of Sheffield, Sheffield, UK
- Michael G Nix
- Leeds Cancer Centre, Leeds Teaching Hospitals NHS Trust, UK
- Ali Gooya
- School of Computing Science, University of Glasgow, Glasgow, UK; Alan Turing Institute, London, UK
- Isuru Wijesinghe
- Centre for Computational Imaging and Simulation Technologies in Biomedicine, School of Mechanical Engineering, University of Leeds, Leeds, UK
- Zeike A Taylor
- Centre for Computational Imaging and Simulation Technologies in Biomedicine, School of Mechanical Engineering, University of Leeds, Leeds, UK.
78
Broll A, Goldhacker M, Hahnel S, Rosentritt M. Generative deep learning approaches for the design of dental restorations: A narrative review. J Dent 2024; 145:104988. [PMID: 38608832 DOI: 10.1016/j.jdent.2024.104988] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2024] [Revised: 03/13/2024] [Accepted: 04/03/2024] [Indexed: 04/14/2024] Open
Abstract
OBJECTIVES This study aims to explore and discuss recent advancements in tooth reconstruction utilizing deep learning (DL) techniques. A review on new DL methodologies in partial and full tooth reconstruction is conducted. DATA/SOURCES PubMed, Google Scholar, and IEEE Xplore databases were searched for articles from 2003 to 2023. STUDY SELECTION The review includes 9 articles published from 2018 to 2023. The selected articles showcase novel DL approaches for tooth reconstruction, while those concentrating solely on the application or review of DL methods are excluded. The review shows that data is acquired via intraoral scans or laboratory scans of dental plaster models. Common data representations are depth maps, point clouds, and voxelized point clouds. Reconstructions focus on single teeth, using data from adjacent teeth or the entire jaw. Some articles include antagonist teeth data and features like occlusal grooves and gap distance. Primary network architectures include Generative Adversarial Networks (GANs) and Transformers. Compared to conventional digital methods, DL-based tooth reconstruction reports error rates approximately two times lower. CONCLUSIONS Generative DL models analyze dental datasets to reconstruct missing teeth by extracting insights into patterns and structures. Through specialized application, these models reconstruct morphologically and functionally sound dental structures, leveraging information from the existing teeth. The reported advancements facilitate the feasibility of DL-based dental crown reconstruction. Beyond GANs and Transformers with point clouds or voxels, recent studies indicate promising outcomes with diffusion-based architectures and innovative data representations like wavelets for 3D shape completion and inference problems. CLINICAL SIGNIFICANCE Generative network architectures employed in the analysis and reconstruction of dental structures demonstrate notable proficiency. 
The enhanced accuracy and efficiency of DL-based frameworks hold the potential to enhance clinical outcomes and increase patient satisfaction. The reduced reconstruction times and diminished requirement for manual intervention may lead to cost savings and improved accessibility of dental services.
Affiliation(s)
- Alexander Broll
- Department of Prosthetic Dentistry, University Hospital Regensburg, Regensburg, Germany
- Markus Goldhacker
- Faculty of Mechanical Engineering, OTH Regensburg, Regensburg, Germany
- Sebastian Hahnel
- Department of Prosthetic Dentistry, University Hospital Regensburg, Regensburg, Germany
- Martin Rosentritt
- Department of Prosthetic Dentistry, University Hospital Regensburg, Regensburg, Germany
79
Meng M, Wang Y, Zhu M, Tao X, Mao Z, Liao J, Bian Z, Zeng D, Ma J. DDT-Net: Dose-Agnostic Dual-Task Transfer Network for Simultaneous Low-Dose CT Denoising and Simulation. IEEE J Biomed Health Inform 2024; 28:3613-3625. [PMID: 38478459 DOI: 10.1109/jbhi.2024.3376628] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/20/2024]
Abstract
Deep learning (DL) algorithms have achieved unprecedented success in low-dose CT (LDCT) imaging and are expected to form a new generation of CT reconstruction technology. However, most DL-based denoising models lack the ability to generalize to unseen dose data. Moreover, most simulation tools for LDCT typically operate on proprietary projection data, which is generally not accessible without an established collaboration with CT manufacturers. To alleviate these issues, in this work, we propose a dose-agnostic dual-task transfer network, termed DDT-Net, for simultaneous LDCT denoising and simulation. Concretely, the dual-task learning module integrates the LDCT denoising and simulation tasks into a unified optimization framework by learning the joint distribution of LDCT and normal-dose CT (NDCT) data. We approximate the joint distribution of continuous dose levels by training DDT-Net with discrete dose data, which allows generalization to denoising and simulation of unseen dose data. In particular, the mixed-dose training strategy adopted by DDT-Net can promote the denoising performance on lower-dose data. The paired dataset simulated by DDT-Net can be used for data augmentation to further restore the tissue texture of LDCT images. Experimental results on synthetic and clinical data show that the proposed DDT-Net outperforms competing methods in denoising and generalization performance on unseen dose data, and it also provides a simulation tool that can quickly generate realistic LDCT images at arbitrary dose levels.
80
Kimura Y, Suyama TQ, Shimamura Y, Suzuki J, Watanabe M, Igei H, Otera Y, Kaneko T, Suzukawa M, Matsui H, Kudo H. Subjective and objective image quality of low-dose CT images processed using a self-supervised denoising algorithm. Radiol Phys Technol 2024; 17:367-374. [PMID: 38413510 DOI: 10.1007/s12194-024-00786-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2023] [Revised: 01/11/2024] [Accepted: 01/25/2024] [Indexed: 02/29/2024]
Abstract
This study aimed to assess the subjective and objective image quality of low-dose computed tomography (CT) images processed using a self-supervised denoising algorithm with deep learning. We trained the self-supervised denoising model using low-dose CT images of 40 patients and applied this model to CT images of another 30 patients. Image quality, in terms of noise and edge sharpness, was rated on a 5-point scale by two radiologists. The coefficient of variation, contrast-to-noise ratio (CNR), and signal-to-noise ratio (SNR) were calculated. The values for the self-supervised denoising model were compared with those for the original low-dose CT images and CT images processed using other conventional denoising algorithms (non-local means, block-matching and 3D filtering, and total variation minimization-based algorithms). The mean (standard deviation) scores of local and overall noise levels for the self-supervised denoising algorithm were 3.90 (0.40) and 3.93 (0.51), respectively, outperforming the original image and other algorithms. Similarly, the mean scores of local and overall edge sharpness for the self-supervised denoising algorithm were 3.90 (0.40) and 3.75 (0.47), respectively, surpassing the scores of the original image and other algorithms. The CNR and SNR for the self-supervised denoising algorithm were higher than those for the original images but slightly lower than those for the other algorithms. Our findings indicate the potential clinical applicability of the self-supervised denoising algorithm for low-dose CT images in clinical settings.
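The objective metrics named in the abstract above (coefficient of variation, CNR, SNR) have common ROI-based definitions, sketched below; the study's exact ROI conventions and formulas may differ, and all names here are illustrative:

```python
import numpy as np

def roi_quality_metrics(image, signal_mask, background_mask):
    """Compute SNR, CNR, and coefficient of variation from two ROIs."""
    s = image[signal_mask]       # signal region pixel values
    b = image[background_mask]   # background region pixel values
    snr = s.mean() / s.std()                   # signal-to-noise ratio
    cnr = abs(s.mean() - b.mean()) / b.std()   # contrast-to-noise ratio
    cv = s.std() / s.mean()                    # coefficient of variation
    return snr, cnr, cv
```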
Affiliation(s)
- Yuya Kimura
- Clinical Research Center, National Hospital Organization Tokyo National Hospital, Tokyo, Japan.
- Department of Clinical Epidemiology and Health Economics, School of Public Health, University of Tokyo, Tokyo, Japan.
- Takeru Q Suyama
- Nadogaya Research Institute, Nadogaya Hospital, Chiba, Japan
- Jun Suzuki
- Department of Respiratory Medicine, National Hospital Organization Tokyo Hospital, Tokyo, Japan
- Department of Radiology, Saitama Medical University International Medical Center, Saitama, Japan
- Masato Watanabe
- Department of Respiratory Medicine, National Hospital Organization Tokyo Hospital, Tokyo, Japan
- Hiroshi Igei
- Department of Respiratory Medicine, National Hospital Organization Tokyo Hospital, Tokyo, Japan
- Yuya Otera
- Department of Radiology, National Hospital Organization Tokyo Hospital, Tokyo, Japan
- Takayuki Kaneko
- Radiological Physics and Technology Department, National Center for Global Health and Medicine, Tokyo, Japan
- Maho Suzukawa
- Clinical Research Center, National Hospital Organization Tokyo National Hospital, Tokyo, Japan
- Hirotoshi Matsui
- Department of Respiratory Medicine, National Hospital Organization Tokyo Hospital, Tokyo, Japan
- Hiroyuki Kudo
- Institute of Systems and Information Engineering, University of Tsukuba, Ibaraki, Japan
81
Zhang Y, Zhang R, Cao R, Xu F, Jiang F, Meng J, Ma F, Guo Y, Liu J. Unsupervised low-dose CT denoising using bidirectional contrastive network. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 251:108206. [PMID: 38723435 DOI: 10.1016/j.cmpb.2024.108206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/07/2024] [Revised: 04/16/2024] [Accepted: 04/29/2024] [Indexed: 05/31/2024]
Abstract
BACKGROUND AND OBJECTIVE Low-dose computed tomography (LDCT) scans significantly reduce radiation exposure, but introduce higher levels of noise and artifacts that compromise image quality and diagnostic accuracy. Supervised learning methods have proven effective in denoising LDCT images, but are hampered by the need for large, paired datasets, which pose significant challenges in data acquisition. This study aims to develop a robust unsupervised LDCT denoising method that overcomes the reliance on paired LDCT and normal-dose CT (NDCT) samples, paving the way for more accessible and practical denoising techniques. METHODS We propose a novel unsupervised network model, Bidirectional Contrastive Unsupervised Denoising (BCUD), for LDCT denoising. This model innovatively combines a bidirectional network structure with contrastive learning theory to map the precise mutual correspondence between the noisy LDCT image domain and the clean NDCT image domain. Specifically, we employ dual encoders and discriminators for domain-specific data generation, and use unique projection heads for each domain to adaptively learn customized embedded representations. We then align corresponding features across domains within the learned embedding spaces to achieve effective noise reduction. This approach fundamentally improves the model's ability to match features in latent space, thereby improving noise reduction while preserving fine image detail. RESULTS Through extensive experimental validation on the AAPM-Mayo public dataset and real-world clinical datasets, the proposed BCUD method demonstrated superior performance. It achieved a peak signal-to-noise ratio (PSNR) of 31.387 dB, a structural similarity index measure (SSIM) of 0.886, an information fidelity criterion (IFC) of 2.305, and a visual information fidelity (VIF) of 0.373. 
Notably, subjective evaluation by radiologists resulted in a mean score of 4.23, highlighting its advantages over existing methods in terms of clinical applicability. CONCLUSIONS This paper presents an innovative unsupervised LDCT denoising method using a bidirectional contrastive network, which greatly improves clinical applicability by eliminating the need for perfectly matched image pairs. The method sets a new benchmark in unsupervised LDCT image denoising, excelling in noise reduction and preservation of fine structural details.
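The PSNR figure quoted above is a standard fidelity metric. For reference, a minimal NumPy sketch of its computation follows; this is the generic definition, not the authors' evaluation code, and the image sizes are illustrative only.

```python
import numpy as np

def psnr(reference, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)

# Toy check: a uniform error of 0.1 on a [0, 1] intensity scale gives ~20 dB.
clean = np.zeros((64, 64))
shifted = clean + 0.1
print(round(psnr(clean, shifted), 6))  # 20.0
```

For CT images, `data_range` would typically be set to the display window width in Hounsfield units rather than 1.0.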
Affiliation(s)
- Yuanke Zhang
- School of Computer Science, Qufu Normal University, Rizhao 276826, China; Shandong Provincial Key Laboratory of Data Security and Intelligent Computing, Qufu Normal University, Rizhao 276826, China.
- Rui Zhang
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Rujuan Cao
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Fan Xu
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Fengjuan Jiang
- School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Jing Meng
- School of Computer Science, Qufu Normal University, Rizhao 276826, China; Shandong Provincial Key Laboratory of Data Security and Intelligent Computing, Qufu Normal University, Rizhao 276826, China
- Fei Ma
- School of Computer Science, Qufu Normal University, Rizhao 276826, China; Shandong Provincial Key Laboratory of Data Security and Intelligent Computing, Qufu Normal University, Rizhao 276826, China
- Yanfei Guo
- School of Computer Science, Qufu Normal University, Rizhao 276826, China; Shandong Provincial Key Laboratory of Data Security and Intelligent Computing, Qufu Normal University, Rizhao 276826, China
- Jianlei Liu
- School of Computer Science, Qufu Normal University, Rizhao 276826, China; Shandong Provincial Key Laboratory of Data Security and Intelligent Computing, Qufu Normal University, Rizhao 276826, China
|
82
|
Oh J, Wu D, Hong B, Lee D, Kang M, Li Q, Kim K. Texture-preserving low dose CT image denoising using Pearson divergence. Phys Med Biol 2024; 69:115021. [PMID: 38688292 DOI: 10.1088/1361-6560/ad45a4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/29/2023] [Accepted: 04/30/2024] [Indexed: 05/02/2024]
Abstract
Objective. The mean squared error (MSE), also known as the L2 loss, has been widely used as a loss function to optimize image denoising models due to its strong performance as a mean estimator of the Gaussian noise model. Recently, various low-dose computed tomography (LDCT) image denoising methods using deep learning combined with the MSE loss have been developed; however, this approach has been observed to suffer from the regression-to-the-mean problem, leading to over-smoothed edges and degradation of texture in the image. Approach. To overcome this issue, we propose a stochastic function in the loss function to improve the texture of the denoised CT images, rather than relying on complicated networks or feature-space losses. The proposed loss function includes the MSE loss to learn the mean distribution and the Pearson divergence loss to learn feature textures. Specifically, the Pearson divergence loss is computed in image space to measure the distance between two intensity measures of denoised low-dose and normal-dose CT images. The evaluation of the proposed model employs a novel multi-metric quantitative analysis utilizing relative texture feature distance. Results. Our experimental results show that the proposed Pearson divergence loss leads to a significant improvement in texture compared to the conventional MSE loss and generative adversarial network (GAN), both qualitatively and quantitatively. Significance. Achieving consistent texture preservation in LDCT is a challenge for conventional GAN-type methods due to adversarial aspects aimed at minimizing noise while preserving texture. By incorporating the Pearson regularizer in the loss function, we can easily achieve a balance between these two conflicting properties. Consistent high-quality CT images can significantly help clinicians in diagnosis and support researchers in the development of AI diagnostic models.
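The Pearson (chi-square) divergence between two intensity measures, as used in the Approach above, can be illustrated with image histograms. The sketch below is a hedged illustration only: the bin count, the epsilon, and the 0.1 weighting of the divergence term are assumptions for demonstration, not values from the paper.

```python
import numpy as np

def pearson_divergence(a, b, bins=64, eps=1e-8):
    """Pearson chi-square divergence sum((p - q)^2 / q) between the
    normalized intensity histograms of two images (bins/eps illustrative)."""
    lo = float(min(a.min(), b.min()))
    hi = float(max(a.max(), b.max()))
    p, _ = np.histogram(a, bins=bins, range=(lo, hi))
    q, _ = np.histogram(b, bins=bins, range=(lo, hi))
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum((p - q) ** 2 / (q + eps)))

def combined_loss(denoised, target, weight=0.1):
    """MSE for the mean behaviour plus a weighted Pearson term for texture."""
    mse = float(np.mean((denoised - target) ** 2))
    return mse + weight * pearson_divergence(denoised, target)
```

In training, the divergence term penalizes mismatch between the intensity distributions of denoised and normal-dose images, which is what discourages over-smoothing.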
Affiliation(s)
- Jieun Oh
- Center for Advanced Medical Computing and Analysis (CAMCA), Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, United States of America
- Chungnam National University College of Medicine, Chungnam National University Hospital, Daejeon, 35015, Republic of Korea
- Dufan Wu
- Center for Advanced Medical Computing and Analysis (CAMCA), Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, United States of America
- Boohwi Hong
- Chungnam National University College of Medicine, Chungnam National University Hospital, Daejeon, 35015, Republic of Korea
- Dongheon Lee
- Chungnam National University College of Medicine, Chungnam National University Hospital, Daejeon, 35015, Republic of Korea
- Minwoong Kang
- Chungnam National University College of Medicine, Chungnam National University Hospital, Daejeon, 35015, Republic of Korea
- Quanzheng Li
- Center for Advanced Medical Computing and Analysis (CAMCA), Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, United States of America
- Kyungsang Kim
- Center for Advanced Medical Computing and Analysis (CAMCA), Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, United States of America
|
83
|
Teramoto S, Uga Y. Convolutional neural networks combined with conventional filtering to semantically segment plant roots in rapidly scanned X-ray computed tomography volumes with high noise levels. PLANT METHODS 2024; 20:73. [PMID: 38773503 PMCID: PMC11106967 DOI: 10.1186/s13007-024-01208-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/09/2023] [Accepted: 05/15/2024] [Indexed: 05/23/2024]
Abstract
BACKGROUND X-ray computed tomography (CT) is a powerful tool for measuring plant root growth in soil. However, a rapid scan with larger pots, which is required for throughput-prioritized crop breeding, results in high noise levels, low resolution, and blurred root segments in the CT volumes. Moreover, while plant root segmentation is essential for root quantification, detailed conditional studies on segmenting noisy root segments are scarce. The present study aimed to investigate the effects of scanning time and deep learning-based restoration of image quality on semantic segmentation of blurry rice (Oryza sativa) root segments in CT volumes. RESULTS VoxResNet, a convolutional neural network-based voxel-wise residual network, was used as the segmentation model. The training efficiency of the model was compared using CT volumes obtained at scan times of 33, 66, 150, 300, and 600 s. The learning efficiencies of the samples were similar, except for scan times of 33 and 66 s. In addition, the noise levels of the predicted volumes differed among scanning conditions, indicating that the noise level at a scan time of ≥ 150 s does not affect the model training efficiency. Conventional filtering methods, such as median filtering and edge detection, increased the training efficiency by approximately 10% under all conditions. However, the training efficiency of the 33 and 66 s-scanned samples remained relatively low. We concluded that the scan time must be at least 150 s so as not to affect segmentation. Finally, we constructed a semantic segmentation model for 150 s-scanned CT volumes, for which the Dice loss reached 0.093. This model could not predict the lateral roots, which were not included in the training data. This limitation will be addressed by preparing appropriate training data. CONCLUSIONS A semantic segmentation model can be constructed even with rapidly scanned CT volumes with high noise levels.
Given that scanning times ≥ 150 s did not affect the segmentation results, this technique holds promise for rapid and low-dose scanning. The study also offers insights for annotating image types other than CT volumes in which high noise levels make structures challenging to identify.
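The Dice loss quoted above (0.093) is one minus the Dice overlap coefficient, so lower is better. A generic NumPy sketch of the metric, not the study's implementation, is:

```python
import numpy as np

def dice_loss(pred, target, eps=1e-7):
    """Soft Dice loss: 1 - 2|P∩T| / (|P| + |T|). 0 = perfect overlap, ~1 = none."""
    pred = np.asarray(pred, dtype=np.float64).ravel()
    target = np.asarray(target, dtype=np.float64).ravel()
    inter = float(np.sum(pred * target))
    return 1.0 - (2.0 * inter + eps) / (float(pred.sum() + target.sum()) + eps)

mask = np.array([[1, 1], [0, 0]], dtype=float)
print(dice_loss(mask, mask))  # 0.0 (perfect overlap)
```

With soft (probabilistic) predictions the same formula is differentiable, which is why it is commonly used directly as a training loss for segmentation networks.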
Affiliation(s)
- Shota Teramoto
- Institute of Crop Sciences, National Agriculture & Food Research Organization, Tsukuba, Ibaraki, 305-8602, Japan.
- Yusaku Uga
- Institute of Crop Sciences, National Agriculture & Food Research Organization, Tsukuba, Ibaraki, 305-8602, Japan
|
84
|
Im JY, Halliburton SS, Mei K, Perkins AE, Wong E, Roshkovan L, Sandvold OF, Liu LP, Gang GJ, Noël PB. Patient-derived PixelPrint phantoms for evaluating clinical imaging performance of a deep learning CT reconstruction algorithm. Phys Med Biol 2024; 69:115009. [PMID: 38604190 PMCID: PMC11097966 DOI: 10.1088/1361-6560/ad3dba] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2023] [Revised: 03/22/2024] [Accepted: 04/11/2024] [Indexed: 04/13/2024]
Abstract
Objective. Deep learning reconstruction (DLR) algorithms exhibit object-dependent resolution and noise performance. Thus, traditional geometric CT phantoms cannot fully capture the clinical imaging performance of DLR. This study uses a patient-derived 3D-printed PixelPrint lung phantom to evaluate a commercial DLR algorithm across a wide range of radiation dose levels. Method. The lung phantom used in this study is based on a patient chest CT scan containing ground glass opacities and was fabricated using PixelPrint 3D-printing technology. The phantom was placed inside two different size extension rings to mimic a small- and medium-sized patient and was scanned on a conventional CT scanner at exposures between 0.5 and 20 mGy. Each scan was reconstructed using filtered back projection (FBP), iterative reconstruction, and DLR at five levels of denoising. Image noise, contrast-to-noise ratio (CNR), root mean squared error, structural similarity index (SSIM), and multi-scale SSIM (MS-SSIM) were calculated for each image. Results. DLR demonstrated superior performance compared to FBP and iterative reconstruction for all measured metrics in both phantom sizes, with better performance for more aggressive denoising levels. DLR was estimated to reduce dose by 25%-83% in the small phantom and by 50%-83% in the medium phantom without decreasing image quality for any of the metrics measured in this study. These dose reduction estimates are more conservative compared to the estimates obtained when only considering noise and CNR. Conclusion. DLR has the capability of producing diagnostic image quality at up to 83% lower radiation dose, which can improve the clinical utility and viability of lower dose CT scans. Furthermore, the PixelPrint phantom used in this study offers an improved testing environment with more realistic tissue structures compared to traditional CT phantoms, allowing for structure-based image quality evaluation beyond noise and contrast-based assessments.
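Of the metrics listed above, CNR is the simplest to state. The sketch below uses one common definition (ROI-vs-background contrast over background noise); definitions vary across studies, and the intensity values here are synthetic, not from the phantom experiments.

```python
import numpy as np

def cnr(roi, background):
    """Contrast-to-noise ratio, one common definition:
    |mean(ROI) - mean(background)| / std(background)."""
    roi = np.asarray(roi, dtype=np.float64)
    background = np.asarray(background, dtype=np.float64)
    return float(abs(roi.mean() - background.mean()) / background.std())

# Synthetic example: a bright insert against a noisy uniform background.
rng = np.random.default_rng(0)
insert = rng.normal(100.0, 5.0, size=(16, 16))   # illustrative HU-like values
bg = rng.normal(50.0, 5.0, size=(32, 32))
print(round(cnr(insert, bg), 1))
```

A denoising reconstruction raises CNR mainly by shrinking `std(background)`, which is why the paper cautions that dose-reduction estimates based only on noise and CNR are optimistic.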
Affiliation(s)
- Jessica Y Im
- Department of Radiology, University of Pennsylvania, Philadelphia, PA, United States of America
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, United States of America
- Kai Mei
- Department of Radiology, University of Pennsylvania, Philadelphia, PA, United States of America
- Amy E Perkins
- Philips Healthcare, Cleveland, OH, United States of America
- Eddy Wong
- Philips Healthcare, Cleveland, OH, United States of America
- Leonid Roshkovan
- Department of Radiology, University of Pennsylvania, Philadelphia, PA, United States of America
- Olivia F Sandvold
- Department of Radiology, University of Pennsylvania, Philadelphia, PA, United States of America
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, United States of America
- Leening P Liu
- Department of Radiology, University of Pennsylvania, Philadelphia, PA, United States of America
- Department of Bioengineering, University of Pennsylvania, Philadelphia, PA, United States of America
- Grace J Gang
- Department of Radiology, University of Pennsylvania, Philadelphia, PA, United States of America
- Peter B Noël
- Department of Radiology, University of Pennsylvania, Philadelphia, PA, United States of America
|
85
|
Masayoshi K, Katada Y, Ozawa N, Ibuki M, Negishi K, Kurihara T. Deep learning segmentation of non-perfusion area from color fundus images and AI-generated fluorescein angiography. Sci Rep 2024; 14:10801. [PMID: 38734727 PMCID: PMC11088618 DOI: 10.1038/s41598-024-61561-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2024] [Accepted: 05/07/2024] [Indexed: 05/13/2024] Open
Abstract
The non-perfusion area (NPA) of the retina is an important indicator in the visual prognosis of patients with branch retinal vein occlusion (BRVO). However, the current evaluation method of NPA, fluorescein angiography (FA), is invasive and burdensome. In this study, we examined the use of deep learning models for detecting NPA in color fundus images, bypassing the need for FA, and we also investigated the utility of synthetic FA generated from color fundus images. The models were evaluated using the Dice score and Monte Carlo dropout uncertainty. We retrospectively collected 403 sets of color fundus and FA images from 319 BRVO patients. We trained three deep learning models on FA, color fundus images, and synthetic FA. As a result, though the FA model achieved the highest score, the other two models also performed comparably. We found no statistical significance in median Dice scores between the models. However, the color fundus model showed significantly higher uncertainty than the other models (p < 0.05). In conclusion, deep learning models can detect NPAs from color fundus images with reasonable accuracy, though with somewhat less prediction stability. Synthetic FA stabilizes the prediction and reduces misleading uncertainty estimates by enhancing image quality.
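The Monte Carlo dropout uncertainty used to compare the models can be sketched in a few lines: run a stochastic forward pass repeatedly and take the per-pixel standard deviation as the uncertainty map. The `toy_forward` stand-in below is an illustrative assumption, not the study's segmentation network.

```python
import numpy as np

def mc_dropout_uncertainty(forward, x, n_samples=20, seed=0):
    """Monte Carlo dropout: repeat a stochastic forward pass and return the
    per-pixel mean prediction and standard deviation (the uncertainty map)."""
    rng = np.random.default_rng(seed)
    preds = np.stack([forward(x, rng) for _ in range(n_samples)])
    return preds.mean(axis=0), preds.std(axis=0)

def toy_forward(x, rng, p=0.5):
    """Stand-in for a network with dropout left active at test time."""
    keep = rng.random(x.shape) >= p
    return x * keep / (1.0 - p)  # inverted-dropout scaling

mean_pred, uncertainty = mc_dropout_uncertainty(toy_forward, np.ones((8, 8)), n_samples=50)
print(uncertainty.shape)  # (8, 8)
```

In the study's terms, a model whose repeated stochastic passes disagree more (larger `uncertainty`) is the less stable predictor, which is how the color-fundus model was distinguished from the FA-based ones.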
Affiliation(s)
- Kanato Masayoshi
- Laboratory of Photobiology, Keio University School of Medicine, 35 Shinanomachi, Shinjuku-Ku, Tokyo, Japan
- Yusaku Katada
- Laboratory of Photobiology, Keio University School of Medicine, 35 Shinanomachi, Shinjuku-Ku, Tokyo, Japan
- Department of Ophthalmology, Keio University School of Medicine, Shinanomachi, Shinjuku-Ku, Tokyo, Japan
- Nobuhiro Ozawa
- Laboratory of Photobiology, Keio University School of Medicine, 35 Shinanomachi, Shinjuku-Ku, Tokyo, Japan
- Department of Ophthalmology, Keio University School of Medicine, Shinanomachi, Shinjuku-Ku, Tokyo, Japan
- Mari Ibuki
- Laboratory of Photobiology, Keio University School of Medicine, 35 Shinanomachi, Shinjuku-Ku, Tokyo, Japan
- Department of Ophthalmology, Keio University School of Medicine, Shinanomachi, Shinjuku-Ku, Tokyo, Japan
- Kazuno Negishi
- Department of Ophthalmology, Keio University School of Medicine, Shinanomachi, Shinjuku-Ku, Tokyo, Japan
- Toshihide Kurihara
- Laboratory of Photobiology, Keio University School of Medicine, 35 Shinanomachi, Shinjuku-Ku, Tokyo, Japan.
- Department of Ophthalmology, Keio University School of Medicine, Shinanomachi, Shinjuku-Ku, Tokyo, Japan.
|
86
|
Kang J, Liu Y, Zhang P, Guo N, Wang L, Du Y, Gui Z. FSformer: A combined frequency separation network and transformer for LDCT denoising. Comput Biol Med 2024; 173:108378. [PMID: 38554660 DOI: 10.1016/j.compbiomed.2024.108378] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2023] [Revised: 03/01/2024] [Accepted: 03/24/2024] [Indexed: 04/02/2024]
Abstract
Low-dose computed tomography (LDCT) has attracted wide attention in the field of medical imaging because of its low radiation hazard to humans. However, under low-dose radiation scenarios, large amounts of noise/artifacts are present in the reconstructed image, which reduces the clarity of the image and is not conducive to diagnosis. To improve LDCT image quality, we propose a combined frequency separation network and Transformer (FSformer) for LDCT denoising. Firstly, FSformer decomposes the LDCT images into low-frequency images and multi-layer high-frequency images by frequency separation blocks. Then, the low-frequency components are fused with the high-frequency components of different layers to remove the noise in the high-frequency components with the help of the underlying texture of the low-frequency parts. Next, the estimated noise images are obtained by the Transformer stage in the frequency aggregation denoising block. Finally, they are fed into the reconstruction prediction block to obtain improved-quality images. In addition, a compound loss function with frequency loss and Charbonnier loss is used to guide the training of the network. The performance of FSformer has been validated and evaluated on the AAPM Mayo dataset, a real piglet dataset, and a clinical dataset. Compared with previous representative models of different architectures, FSformer achieves the best metrics, with a PSNR of 33.7714 dB and an SSIM of 0.9254 on the Mayo dataset and a testing time of 1.825 s. The experimental results show that FSformer is a state-of-the-art (SOTA) model for noise/artifact suppression and texture/organization preservation. Moreover, the model is robust and can effectively improve LDCT image quality.
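The low/high frequency decomposition underlying such methods can be illustrated with a fixed FFT low-pass split. Note the hedge: FSformer's separation blocks are learned, whereas the sketch below uses a hand-picked circular mask purely to show the idea that the image equals its low-frequency part plus a high-frequency residual.

```python
import numpy as np

def frequency_separate(img, cutoff=0.1):
    """Split an image into low- and high-frequency parts with a circular
    FFT low-pass mask (cutoff is a fraction of the image size)."""
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = h // 2, w // 2                      # DC sits here after fftshift
    radius = cutoff * min(h, w)
    mask = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
    low = np.real(np.fft.ifft2(np.fft.ifftshift(f * mask)))
    high = img - low  # residual: low + high reconstructs the input exactly
    return low, high
```

Since most CT noise energy lives in the high-frequency band while anatomy dominates the low band, denoising the `high` component guided by `low` is the intuition the abstract describes.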
Affiliation(s)
- Jiaqi Kang
- State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, 030051, China; School of Information and Communication Engineering, North University of China, Taiyuan, 030051, China
- Yi Liu
- State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, 030051, China; School of Information and Communication Engineering, North University of China, Taiyuan, 030051, China
- Pengcheng Zhang
- State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, 030051, China; School of Information and Communication Engineering, North University of China, Taiyuan, 030051, China
- Niu Guo
- State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, 030051, China; School of Information and Communication Engineering, North University of China, Taiyuan, 030051, China
- Lei Wang
- State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, 030051, China; School of Information and Communication Engineering, North University of China, Taiyuan, 030051, China
- Yinglin Du
- State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, 030051, China; School of Information and Communication Engineering, North University of China, Taiyuan, 030051, China
- Zhiguo Gui
- State Key Laboratory of Dynamic Testing Technology, North University of China, Taiyuan, 030051, China; School of Information and Communication Engineering, North University of China, Taiyuan, 030051, China.
|
87
|
Chen Z, Niu C, Gao Q, Wang G, Shan H. LIT-Former: Linking In-Plane and Through-Plane Transformers for Simultaneous CT Image Denoising and Deblurring. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:1880-1894. [PMID: 38194396 DOI: 10.1109/tmi.2024.3351723] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/11/2024]
Abstract
This paper studies 3D low-dose computed tomography (CT) imaging. Although various deep learning methods have been developed in this context, they typically focus on 2D images and perform denoising (for low dose) and deblurring (for super-resolution) separately. To date, little work has been done on simultaneous in-plane denoising and through-plane deblurring, which is important for obtaining high-quality 3D CT images with lower radiation and faster imaging speed. For this task, a straightforward method is to directly train an end-to-end 3D network. However, it demands much more training data and expensive computational costs. Here, we propose to link in-plane and through-plane transformers for simultaneous in-plane denoising and through-plane deblurring, termed LIT-Former, which can efficiently synergize the in-plane and through-plane sub-tasks for 3D CT imaging and enjoys the advantages of both convolution and transformer networks. LIT-Former has two novel designs: efficient multi-head self-attention modules (eMSM) and efficient convolutional feed-forward networks (eCFN). First, eMSM integrates in-plane 2D self-attention and through-plane 1D self-attention to efficiently capture the global interactions of 3D self-attention, the core unit of transformer networks. Second, eCFN integrates 2D convolution and 1D convolution to extract the local information of 3D convolution in the same fashion. As a result, the proposed LIT-Former synergizes these two sub-tasks, significantly reducing computational complexity compared to 3D counterparts and enabling rapid convergence. Extensive experimental results on simulated and clinical datasets demonstrate superior performance over state-of-the-art models. The source code is available at https://github.com/hao1635/LIT-Former.
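The complexity saving from factorizing 3D self-attention into in-plane 2D plus through-plane 1D attention can be checked with simple token-count arithmetic. This is a back-of-the-envelope sketch: the volume size is illustrative, and real attention cost also depends on channels, heads, and window sizes.

```python
def attention_cost(n_tokens):
    """Pairwise self-attention scales quadratically with the token count."""
    return n_tokens ** 2

D, H, W = 64, 256, 256                      # illustrative CT volume size
full_3d = attention_cost(D * H * W)         # one attention over every voxel
in_plane = D * attention_cost(H * W)        # 2D attention per slice
through_plane = H * W * attention_cost(D)   # 1D attention per voxel column
factorized = in_plane + through_plane
print(full_3d / factorized)                 # roughly 64x cheaper here
```

The quadratic term dominates, so the factorized scheme is cheaper by approximately the slice count, which is the kind of reduction the eMSM design exploits.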
|
88
|
Li X, Jing K, Yang Y, Wang Y, Ma J, Zheng H, Xu Z. Noise-Generating and Imaging Mechanism Inspired Implicit Regularization Learning Network for Low Dose CT Reconstruction. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:1677-1689. [PMID: 38145543 DOI: 10.1109/tmi.2023.3347258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/27/2023]
Abstract
Low-dose computed tomography (LDCT) helps to reduce radiation risks in CT scanning while maintaining image quality, which involves a consistent pursuit of lower incident rays and higher reconstruction performance. Although deep learning approaches have achieved encouraging success in LDCT reconstruction, most of them treat the task as a general inverse problem in either the image domain or the dual (sinogram and image) domains. Such frameworks do not consider the original noise generation of the projection data and suffer from limited performance improvement on the LDCT task. In this paper, we propose a novel reconstruction model based on the noise-generating and imaging mechanism in the full domain, which fully considers the statistical properties of the intrinsic noise in LDCT and prior information in the sinogram and image domains. To solve the model, we propose an optimization algorithm based on the proximal gradient technique. Specifically, we theoretically derive approximate solutions of the integer programming problem on the projection data. Instead of hand-crafting the sinogram and image regularizers, we propose to unroll the optimization algorithm into a deep network. The network implicitly learns the proximal operators of the sinogram and image regularizers with two deep neural networks, providing a more interpretable and effective reconstruction procedure. Numerical results demonstrate that our proposed method achieves improvements of >2.9 dB in peak signal-to-noise ratio, >1.4% in the structural similarity metric, and >9 HU reductions in root mean square error over current state-of-the-art LDCT methods.
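The proximal gradient technique that the abstract unrolls into a network can be illustrated with its classic hand-crafted instance, ISTA for an L1-regularized least-squares problem. This is a generic sketch, not the paper's full-domain model: unrolled networks keep this iteration structure but replace the soft-thresholding proximal operator with a learned one.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam=0.1, n_iter=100):
    """Proximal gradient (ISTA) for min_x 0.5*||Ax - y||^2 + lam*||x||_1."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L, L = Lipschitz const. of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)             # gradient of the data-fit term
        x = soft_threshold(x - step * grad, lam * step)
    return x

# With A = I, the minimizer is simply soft-thresholding of y itself.
print(ista(np.eye(2), np.array([2.0, 0.05])))  # approximately [1.9, 0.0]
```

In the unrolled setting, each loop iteration becomes one network stage, and `soft_threshold` is swapped for a small neural network, which is what "implicitly learns the proximal operators" refers to.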
|
89
|
Krishna A, Yenneti S, Wang G, Mueller K. Image factory: A method for synthesizing novel CT images with anatomical guidance. Med Phys 2024; 51:3464-3479. [PMID: 38043097 PMCID: PMC11076177 DOI: 10.1002/mp.16864] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/07/2023] [Revised: 09/26/2023] [Accepted: 10/30/2023] [Indexed: 12/05/2023] Open
Abstract
BACKGROUND Deep learning in medical applications is limited by the low availability of large labeled, annotated, or segmented training datasets. With insufficient data available for model training, these networks cannot learn the fine nuances of the space of possible images in a given medical domain, possibly suppressing important diagnostic features and hence making these deep learning systems suboptimal in their performance and vulnerable to adversarial attacks. PURPOSE We formulate a framework to address this lack of labeled data. We test this formulation in the computed tomography (CT) image domain and present an approach that can synthesize large sets of novel CT images at high resolution across the full Hounsfield unit (HU) range. METHODS Our method requires only a small annotated dataset of lung CT from 30 patients (available online at the TCIA) and a large non-annotated dataset of high-resolution CT images from 14k patients (received from the NIH, not publicly available). It then converts the small annotated dataset into a large annotated dataset, using a sequence of steps including texture learning via StyleGAN, label learning via U-Net, and semi-supervised learning via CycleGAN/Pixel-to-Pixel (P2P) architectures. The large annotated dataset so generated can then be used to train deep learning networks for medical applications. It can also be used to synthesize CT images with varied anatomies that were nonexistent in either of the input datasets, enriching the dataset even further. RESULTS We demonstrate our framework via lung CT scan synthesis along with novel generated annotations and compare it with other state-of-the-art generative models that produce only images without annotations. We evaluate our framework's effectiveness via a visual Turing test with the help of several doctors and radiologists. CONCLUSIONS We gain the capability of generating an unlimited amount of annotated CT images. Our approach works for all HU windows with minimal degradation in anatomical plausibility and hence could be used as a general-purpose framework for annotated data augmentation for deep learning applications in medical imaging.
Affiliation(s)
- Arjun Krishna
- Computer Science Department, Stony Brook University, Stony Brook, New York, USA
- Shanmukha Yenneti
- Computer Science Department, Stony Brook University, Stony Brook, New York, USA
- Ge Wang
- Department of Biomedical Engineering, Rensselaer Polytechnic Institute, Troy, New York, USA
- Klaus Mueller
- Computer Science Department, Stony Brook University, Stony Brook, New York, USA
|
90
|
Shakya KS, Alavi A, Porteous J, K P, Laddi A, Jaiswal M. A Critical Analysis of Deep Semi-Supervised Learning Approaches for Enhanced Medical Image Classification. INFORMATION 2024; 15:246. [DOI: 10.3390/info15050246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/06/2025] Open
Abstract
Deep semi-supervised learning (DSSL) is a machine learning paradigm that blends supervised and unsupervised learning techniques to improve the performance of various models in computer vision tasks. Medical image classification plays a crucial role in disease diagnosis, treatment planning, and patient care. However, obtaining labeled medical image data is often expensive and time-consuming for medical practitioners, leading to limited labeled datasets. DSSL techniques aim to address this challenge, particularly in various medical image tasks, to improve model generalization and performance. DSSL models leverage both the labeled information, which provides explicit supervision, and the unlabeled data, which can provide additional information about the underlying data distribution. This offers a practical solution to the resource-intensive demands of data annotation and enhances the model's ability to generalize across diverse and previously unseen data landscapes. The present study provides a critical review of various DSSL approaches and their effectiveness and challenges in enhancing medical image classification tasks. The study categorizes DSSL techniques into six classes: consistency regularization methods, deep adversarial methods, pseudo-learning methods, graph-based methods, multi-label methods, and hybrid methods. Further, a comparative performance analysis of the six considered methods is conducted using existing studies. The referenced studies have employed metrics such as accuracy, sensitivity, specificity, AUC-ROC, and F1 score to evaluate the performance of DSSL methods on different medical image datasets. Additionally, challenges of the datasets, such as heterogeneity, limited labeled data, and model interpretability, are discussed and highlighted in the context of DSSL for medical image classification. The current review provides future directions and considerations for researchers to further address the challenges and take full advantage of these methods in clinical practice.
Affiliation(s)
- Kaushlesh Singh Shakya
- Academy of Scientific & Innovative Research (AcSIR), Ghaziabad 201002, India
- CSIR-Central Scientific Instruments Organisation, Chandigarh 160030, India
- School of Computing Technologies, RMIT University, Melbourne, VIC 3000, Australia
- Azadeh Alavi
- School of Computing Technologies, RMIT University, Melbourne, VIC 3000, Australia
- Julie Porteous
- School of Computing Technologies, RMIT University, Melbourne, VIC 3000, Australia
- Priti K
- Academy of Scientific & Innovative Research (AcSIR), Ghaziabad 201002, India
- CSIR-Central Scientific Instruments Organisation, Chandigarh 160030, India
- Amit Laddi
- Academy of Scientific & Innovative Research (AcSIR), Ghaziabad 201002, India
- CSIR-Central Scientific Instruments Organisation, Chandigarh 160030, India
- Manojkumar Jaiswal
- Oral Health Sciences Centre, Post Graduate Institute of Medical Education & Research (PGIMER), Chandigarh 160012, India
|
91
|
Liu C, Klein L, Huang Y, Baader E, Lell M, Kachelrieß M, Maier A. Two-view topogram-based anatomy-guided CT reconstruction for prospective risk minimization. Sci Rep 2024; 14:9373. [PMID: 38653993 DOI: 10.1038/s41598-024-59731-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/30/2024] [Accepted: 04/15/2024] [Indexed: 04/25/2024] Open
Abstract
To facilitate a prospective estimation of the effective dose of a CT scan prior to the actual scanning, so that sophisticated patient-risk-minimizing methods can be used, a prospective spatial dose estimation and the known anatomical structures are required. To this end, a CT reconstruction method is needed that reconstructs CT volumes from as few projections as possible, i.e. from the topograms, with anatomical structures as correct as possible. In this work, an optimized CT reconstruction model based on a generative adversarial network (GAN) is proposed. The GAN is trained to reconstruct 3D volumes from an anterior-posterior and a lateral CT projection. To enhance anatomical structures, a pre-trained organ segmentation network and the 3D perceptual loss are applied during the training phase, so that the model can generate both an organ-enhanced CT volume and organ segmentation masks. The proposed method can reconstruct CT volumes with a PSNR of 26.49, an RMSE of 196.17, and an SSIM of 0.64, compared to 26.21, 201.55, and 0.63 for the baseline method. In terms of anatomical structure, the proposed method effectively enhances organ shapes and boundaries and allows for a straightforward identification of the relevant anatomical structures. We note that conventional reconstruction metrics fail to indicate the enhancement of anatomical structures. In addition to such metrics, the evaluation is expanded by assessing organ segmentation performance. The average organ Dice score of the proposed method is 0.71, compared with 0.63 for the baseline model, indicating the enhancement of anatomical structures.
Affiliation(s)
- Chang Liu
  - Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Laura Klein
  - Division of X-Ray Imaging and Computed Tomography, German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Department of Physics and Astronomy, Ruprecht-Karls-University Heidelberg, Heidelberg, Germany
- Yixing Huang
  - Department of Radiation Oncology, Universitätsklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
- Edith Baader
  - Division of X-Ray Imaging and Computed Tomography, German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Department of Physics and Astronomy, Ruprecht-Karls-University Heidelberg, Heidelberg, Germany
- Michael Lell
  - Department of Radiology and Nuclear Medicine, Klinikum Nürnberg, Paracelsus Medical University, Nuremberg, Germany
- Marc Kachelrieß
  - Division of X-Ray Imaging and Computed Tomography, German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Medical Faculty, Ruprecht-Karls-University Heidelberg, Heidelberg, Germany
- Andreas Maier
  - Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
92
Broll A, Rosentritt M, Schlegl T, Goldhacker M. A data-driven approach for the partial reconstruction of individual human molar teeth using generative deep learning. Front Artif Intell 2024; 7:1339193. [PMID: 38690195 PMCID: PMC11058210 DOI: 10.3389/frai.2024.1339193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2023] [Accepted: 03/19/2024] [Indexed: 05/02/2024] Open
Abstract
Background and objective Due to the high prevalence of dental caries, fixed dental restorations are regularly required to restore compromised teeth or replace missing teeth while retaining function and aesthetic appearance. The fabrication of dental restorations, however, remains challenging due to the complexity of the human masticatory system as well as the unique morphology of each individual dentition. Adaptation and reworking are frequently required during the insertion of fixed dental prostheses (FDPs), which increases cost and treatment time. This article proposes a data-driven approach for the partial reconstruction of occlusal surfaces based on a data set comprising 92 3D mesh files of full dental crown restorations. Methods A Generative Adversarial Network (GAN) is considered for the given task in view of its ability to represent extensive data sets in an unsupervised manner across a wide variety of applications. Having demonstrated good capabilities in terms of image quality and training stability, StyleGAN-2 was chosen as the main network for generating the occlusal surfaces. A 2D projection method is proposed to generate 2D representations of the provided 3D tooth data set for integration with the StyleGAN architecture. The reconstruction capabilities of the trained network are demonstrated on 4 common inlay types using a Bayesian image reconstruction method. This involves pre-processing the data to extract the information about the tooth preparations required by the method, as well as modifying the initial reconstruction loss. Results The reconstruction process yields satisfactory visual and quantitative results for all preparations, with a root mean square error (RMSE) ranging from 0.02 mm to 0.18 mm. When compared against a clinical procedure for CAD inlay fabrication, the group of dentists preferred the GAN-based restorations for 3 of the 4 inlay geometries.
Conclusions This article shows the effectiveness of the StyleGAN architecture with a downstream optimization process for the reconstruction of 4 different inlay geometries. The independence of the reconstruction process from the initial training of the GAN enables the application of the method to arbitrary inlay geometries without time-consuming retraining of the GAN.
Affiliation(s)
- Alexander Broll
  - Department of Prosthetic Dentistry, University Hospital Regensburg, Regensburg, Germany
  - Faculty of Mechanical Engineering, Ostbayerische Technische Hochschule Regensburg, Regensburg, Germany
- Martin Rosentritt
  - Department of Prosthetic Dentistry, University Hospital Regensburg, Regensburg, Germany
- Thomas Schlegl
  - Faculty of Mechanical Engineering, Ostbayerische Technische Hochschule Regensburg, Regensburg, Germany
- Markus Goldhacker
  - Faculty of Mechanical Engineering, Ostbayerische Technische Hochschule Regensburg, Regensburg, Germany
93
Wang L, Meng M, Chen S, Bian Z, Zeng D, Meng D, Ma J. Semi-supervised iterative adaptive network for low-dose CT sinogram recovery. Phys Med Biol 2024; 69:085013. [PMID: 38422540 DOI: 10.1088/1361-6560/ad2ee7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2023] [Accepted: 02/29/2024] [Indexed: 03/02/2024]
Abstract
Background. Concern has been expressed regarding the risk of carcinogenesis from medical computed tomography (CT) radiation. Lowering the radiation dose in CT without appropriate modifications often leads to severe noise-induced artifacts in the images. Deep learning (DL) techniques have achieved promising reconstruction performance in low-dose CT (LDCT) imaging. However, most DL-based algorithms require the pre-collection of a large set of image pairs (low-dose/standard-dose) and train networks in an end-to-end supervised manner. Securing such a large volume of paired, well-registered training data in clinical practice is challenging. Moreover, these algorithms often overlook the abundant information in large collections of LDCT-only images/sinograms. Methods. In this paper, we introduce a semi-supervised iterative adaptive network (SIA-Net) for LDCT imaging that utilizes both labeled and unlabeled sinograms in a cohesive network framework, integrating supervised and unsupervised learning processes. Specifically, the supervised process captures critical features (i.e. noise distribution and tissue characteristics) latent in the paired sinograms, while the unsupervised process learns these features in the unlabeled low-dose sinograms, employing a conventional weighted least-squares model with a regularization term. Furthermore, SIA-Net is designed to adaptively transfer the learned feature distribution from the supervised to the unsupervised process, thereby obtaining a high-fidelity sinogram through iterative adaptive learning.
Finally, high-quality CT images can be reconstructed from the refined sinogram using the filtered back-projection algorithm. Results. Experimental results on two clinical datasets indicate that the proposed SIA-Net achieves competitive performance in terms of noise reduction and structure preservation in LDCT imaging compared with traditional supervised learning methods.
Affiliation(s)
- Lei Wang
  - School of Future Technology, Xi'an Jiaotong University, Xi'an 710049, People's Republic of China
  - School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, People's Republic of China
- Mingqiang Meng
  - School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, People's Republic of China
  - Pazhou Lab (Huangpu), Guangdong, People's Republic of China
- Shixuan Chen
  - School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, People's Republic of China
- Zhaoying Bian
  - School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, People's Republic of China
  - Pazhou Lab (Huangpu), Guangdong, People's Republic of China
- Dong Zeng
  - School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, People's Republic of China
  - Pazhou Lab (Huangpu), Guangdong, People's Republic of China
  - Department of Radiology, Zhujiang Hospital, Southern Medical University, Guangdong, People's Republic of China
- Deyu Meng
  - School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an 710049, People's Republic of China
- Jianhua Ma
  - School of Biomedical Engineering, Southern Medical University, Guangzhou 510515, People's Republic of China
  - School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi, People's Republic of China
94
Yang G, Li C, Yao Y, Wang G, Teng Y. Quasi-supervised learning for super-resolution PET. Comput Med Imaging Graph 2024; 113:102351. [PMID: 38335784 DOI: 10.1016/j.compmedimag.2024.102351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2023] [Revised: 01/15/2024] [Accepted: 02/02/2024] [Indexed: 02/12/2024]
Abstract
Low resolution limits the diagnostic performance of positron emission tomography (PET). Deep learning has been successfully applied to achieve super-resolution PET, but the commonly used supervised learning methods require many pairs of low- and high-resolution (LR and HR) PET images. Although unsupervised learning can use unpaired images, its results are not as good as those obtained with supervised deep learning. In this paper, we propose a quasi-supervised learning method, a new type of weakly supervised learning, to recover HR PET images from LR counterparts by leveraging the similarity between unpaired LR and HR image patches. Specifically, LR image patches from one patient are taken as inputs, while the most similar HR patches from other patients are found as labels. The similarity between the matched HR and LR patches serves as a prior for network construction. The proposed method can be implemented by designing a new network or by modifying an existing one; as an example in this study, we modified the cycle-consistent generative adversarial network (CycleGAN) for super-resolution PET. Our numerical and experimental results qualitatively and quantitatively show the merits of our method relative to state-of-the-art methods. The code is publicly available at https://github.com/PigYang-ops/CycleGAN-QSDL.
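The patch-matching step at the heart of this quasi-supervised scheme can be sketched as follows; this is an illustrative toy, not the authors' implementation (the patch size, stride, and plain L2 similarity are assumptions):

```python
import numpy as np

def extract_patches(img, size, stride):
    """Slide a window over a 2D image and collect flattened patches."""
    patches = []
    for i in range(0, img.shape[0] - size + 1, stride):
        for j in range(0, img.shape[1] - size + 1, stride):
            patches.append(img[i:i + size, j:j + size].ravel())
    return np.stack(patches)

def match_pseudo_labels(lr_img, hr_imgs, size=8, stride=8):
    """For each LR patch from one patient, find the most similar HR patch
    from a pool of *other* patients and use it as a pseudo-label."""
    lr_patches = extract_patches(lr_img, size, stride)
    hr_patches = np.concatenate([extract_patches(h, size, stride) for h in hr_imgs])
    # brute-force pairwise squared L2 distances (fine for an illustration)
    d = ((lr_patches[:, None, :] - hr_patches[None, :, :]) ** 2).sum(-1)
    best = d.argmin(axis=1)
    return lr_patches, hr_patches[best]  # (network inputs, pseudo-labels)
```

The resulting (input, pseudo-label) pairs would then train a network such as the authors' modified CycleGAN, with the patch similarity acting as the prior the abstract describes.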
Affiliation(s)
- Guangtong Yang
  - College of Medicine and Biomedical Information Engineering, Northeastern University, 110004 Shenyang, China
- Chen Li
  - College of Medicine and Biomedical Information Engineering, Northeastern University, 110004 Shenyang, China
- Yudong Yao
  - Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ, USA
- Ge Wang
  - Department of Biomedical Engineering, Rensselaer Polytechnic Institute, Troy, NY, USA
- Yueyang Teng
  - College of Medicine and Biomedical Information Engineering, Northeastern University, 110004 Shenyang, China
95
Shi Z, Kong F, Cheng M, Cao H, Ouyang S, Cao Q. Multi-energy CT material decomposition using graph model improved CNN. Med Biol Eng Comput 2024; 62:1213-1228. [PMID: 38159238 DOI: 10.1007/s11517-023-02986-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/12/2023] [Accepted: 11/30/2023] [Indexed: 01/03/2024]
Abstract
In spectral CT imaging, the basis-material coefficient images obtained by material decomposition can estimate tissue composition, and their accuracy directly affects disease diagnosis. Although convolutional neural networks (CNNs) increase the precision of material decomposition, the traditional CNN convolution operator restricts the extraction of non-local features from the CT image. We introduce a graph model built from multi-scale non-local self-similar patterns into multi-material decomposition (MMD) and propose a novel MMD method based on a graph edge-conditioned convolution U-net (GECCU-net) to enhance material image quality. The GECCU-net focuses on developing a multi-scale encoder: at the coding stage, three paths capture comprehensive image features, and local and non-local feature aggregation (LNFA) blocks integrate the local and non-local features from the different paths. The graph edge-conditioned convolution, operating in non-Euclidean space, extracts the non-local features. A hybrid loss function is defined to accommodate multi-scale input images and avoid over-smoothed results. The proposed network is compared quantitatively with baseline CNN models on simulated and real datasets. The material images generated by the GECCU-net have less noise and fewer artifacts while retaining more tissue information. The structural similarity (SSIM) of the obtained abdomen and chest water maps reaches 0.9976 and 0.9990, respectively, and the RMSE reduces to 0.1218 and 0.4903 g/cm3. The proposed method improves MMD performance and has potential applications.
Affiliation(s)
- Zaifeng Shi
  - School of Microelectronics, Tianjin University, Tianjin, 300072, China
  - Tianjin Key Laboratory of Imaging and Sensing Microelectronic Technology, Tianjin, China
- Fanning Kong
  - School of Microelectronics, Tianjin University, Tianjin, 300072, China
- Ming Cheng
  - School of Microelectronics, Tianjin University, Tianjin, 300072, China
- Huaisheng Cao
  - School of Microelectronics, Tianjin University, Tianjin, 300072, China
- Shunxin Ouyang
  - School of Microelectronics, Tianjin University, Tianjin, 300072, China
- Qingjie Cao
  - School of Mathematical Sciences, Tianjin Normal University, Tianjin, 300387, China
96
Fu M, Zhang N, Huang Z, Zhou C, Zhang X, Yuan J, He Q, Yang Y, Zheng H, Liang D, Wu FX, Fan W, Hu Z. OIF-Net: An Optical Flow Registration-Based PET/MR Cross-Modal Interactive Fusion Network for Low-Count Brain PET Image Denoising. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:1554-1567. [PMID: 38096101 DOI: 10.1109/tmi.2023.3342809] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/04/2024]
Abstract
The short frames of low-count positron emission tomography (PET) images generally cause high levels of statistical noise. Thus, improving the quality of low-count images by using image postprocessing algorithms to achieve better clinical diagnoses has attracted widespread attention in the medical imaging community. Most existing deep learning-based low-count PET image enhancement methods have achieved satisfactory results; however, few of them focus on denoising low-count PET images with the magnetic resonance (MR) image modality as guidance. The prior context features contained in MR images can provide abundant and complementary information for single low-count PET image denoising, especially in ultralow-count (2.5%) cases. To this end, we propose a novel two-stream dual PET/MR cross-modal interactive fusion network with an optical flow pre-alignment module, namely, OIF-Net. Specifically, the learnable optical flow registration module enables the spatial manipulation of MR imaging inputs within the network without any extra training supervision. Registered MR images fundamentally solve the problem of feature misalignment in the multimodal fusion stage, which greatly benefits the subsequent denoising process. In addition, we design a spatial-channel feature enhancement module (SC-FEM) that considers the interactive impacts of multiple modalities and provides additional information flexibility in both the spatial and channel dimensions. Furthermore, instead of simply concatenating the two extracted features from these two modalities as an intermediate fusion method, the proposed cross-modal feature fusion module (CM-FFM) adopts cross-attention at multiple feature levels and greatly improves the two modalities' feature fusion procedure. Extensive experimental assessments conducted on real clinical datasets, as well as an independent clinical testing dataset, demonstrate that the proposed OIF-Net outperforms the state-of-the-art methods.
97
Sherwani MK, Gopalakrishnan S. A systematic literature review: deep learning techniques for synthetic medical image generation and their applications in radiotherapy. FRONTIERS IN RADIOLOGY 2024; 4:1385742. [PMID: 38601888 PMCID: PMC11004271 DOI: 10.3389/fradi.2024.1385742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 02/13/2024] [Accepted: 03/11/2024] [Indexed: 04/12/2024]
Abstract
The aim of this systematic review is to determine whether Deep Learning (DL) algorithms can provide a clinically feasible alternative to classic algorithms for synthetic Computed Tomography (sCT). The following categories are presented in this study:
- MR-based treatment planning and synthetic CT generation techniques.
- Generation of synthetic CT images based on Cone Beam CT images.
- Low-dose CT to High-dose CT generation.
- Attenuation correction for PET images.
To perform appropriate database searches, we reviewed journal articles published between January 2018 and June 2023. Current methodology, study strategies, and results with relevant clinical applications were analyzed as we outlined the state of the art of deep learning-based approaches to inter-modality and intra-modality image synthesis. This was accomplished by contrasting the provided methodologies with traditional research approaches. The key contributions of each category were highlighted, specific challenges were identified, and accomplishments were summarized. As a final step, the statistics of all the cited works were analyzed from various aspects, which revealed that DL-based sCT has achieved considerable popularity while also showing the potential of this technology. To assess the clinical readiness of the presented methods, we examined the current status of DL-based sCT generation.
Affiliation(s)
- Moiz Khan Sherwani
  - Section for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
98
Yin Z, Wu P, Manohar A, McVeigh ER, Pack JD. Protocol Optimization for Functional Cardiac CT Imaging Using Noise Emulation in the Raw Data Domain. ARXIV 2024:arXiv:2403.08486v1. [PMID: 38560739 PMCID: PMC10980088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 04/04/2024]
Abstract
Background Four-dimensional (4D) wide-coverage computed tomography (CT) is an effective imaging modality for measuring the mechanical function of the myocardium. However, the radiation from repeated CT measurements across a number of heartbeats is still a concern. Purpose A projection-domain noise emulation method is presented to generate accurate low-dose (mA-modulated) 4D cardiac CT scans from high-dose scans, enabling protocol optimization to deliver sufficient image quality for functional cardiac analysis while using a dose level that is as low as reasonably achievable (ALARA). Methods Given a targeted low-dose mA modulation curve, the proposed noise emulation method injects both quantum and electronic noise of proper magnitude and correlation into the high-dose data in the projection domain. A spatially varying (i.e., channel-dependent) detector gain term as well as its calibration method are proposed to further improve the noise emulation accuracy. To determine the ALARA dose threshold, a straightforward projection-domain image quality (IQ) metric is proposed, based on the number of projection rays that do not fall in the non-linear region of the detector response. Experiments were performed to validate the noise emulation method on both phantom and clinical data in terms of visual similarity, contrast-to-noise ratio (CNR), and noise power spectrum (NPS). Results For both phantom and clinical data, the low-dose emulated images exhibited noise magnitude (CNR difference within 2%), artifacts, and texture similar to those of the real low-dose images. The proposed channel-dependent detector gain term resulted in a further increase in emulation accuracy. Using the proposed IQ metric, recommended kVp and mA settings were calculated for low-dose 4D cardiac CT acquisitions for patients of different sizes. Conclusions A detailed method to estimate system-dependent parameters for a raw-data-based low-dose emulation framework was described.
The method produced realistic noise levels, artifacts, and texture with phantom and clinical studies. The proposed low-dose emulation method can be used to prospectively select patient-specific minimal-dose protocols for functional cardiac CT.
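The core of such projection-domain dose emulation, adding quantum and electronic noise of the right magnitude to high-dose raw data, can be sketched generically. This simplified version omits the paper's channel-dependent detector gain and calibration; the Gaussian approximation of quantum noise and the `sigma_e` default are assumptions:

```python
import numpy as np

def emulate_low_dose(high_counts, dose_ratio, sigma_e=5.0, rng=None):
    """Emulate detector counts at a reduced tube current (dose_ratio < 1)
    from high-dose counts by injecting extra quantum noise (Gaussian
    approximation of Poisson statistics) plus electronic readout noise."""
    rng = rng or np.random.default_rng()
    scaled = dose_ratio * high_counts            # expected low-dose counts
    # The scaled data already carry dose_ratio**2 times the high-dose
    # Poisson variance; top it up to the full low-dose variance (= mean).
    extra_var = np.clip(scaled - dose_ratio**2 * high_counts, 0.0, None)
    quantum = rng.normal(0.0, np.sqrt(extra_var))
    electronic = rng.normal(0.0, sigma_e, size=high_counts.shape)
    return scaled + quantum + electronic
```

For noise-free input counts I and ratio r, the injected variance is r(1-r)I plus sigma_e squared; with real (already noisy) high-dose data, the total variance approaches the rI expected of a genuine low-dose scan.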
Affiliation(s)
- Zhye Yin
  - GE HealthCare, Waukesha, WI, USA
- Pengwei Wu
  - GE Research Healthcare, Niskayuna, NY, USA
- Ashish Manohar
  - Dept. of Medicine, Stanford University, Palo Alto, CA, USA
- Elliot R. McVeigh
  - Dept. of Bioengineering, Medicine, Radiology at University of California San Diego, San Diego, CA, USA
99
Luo SH, Pan SQ, Chen GY, Xie Y, Ren B, Liu GK, Tian ZQ. Revealing the Denoising Principle of Zero-Shot N2N-Based Algorithm from 1D Spectrum to 2D Image. Anal Chem 2024; 96:4086-4092. [PMID: 38412039 DOI: 10.1021/acs.analchem.3c04608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/29/2024]
Abstract
Denoising is a necessary step in image analysis to extract weak signals, especially those hardly identifiable by the naked eye. Unlike data-driven deep-learning denoising algorithms that rely on a clean image as the reference, Noise2Noise (N2N) can denoise a noisy image provided sufficiently many noisy images of the same subject with randomly distributed noise are available. Further, by introducing data augmentation to create a big data set and regularization to prevent model overfitting, zero-shot N2N-based denoising was proposed, in which only a single noisy image is needed. Although various N2N-based denoising algorithms have been developed with high performance, their complicated black-box operation prevents lightweight implementations. Therefore, to reveal the working function of the zero-shot N2N-based algorithm, we propose a lightweight Peak2Peak (P2P) algorithm and qualitatively and quantitatively analyze its denoising behavior on 1D spectra and 2D images. We find that the high-performance denoising originates from the trade-off balance between the loss function and regularization in the denoising module, where regularization is the switch of denoising, while the signal extraction mainly comes from the self-supervised characteristic learning in the data augmentation module. Furthermore, the lightweight P2P improves the denoising speed by at least ten times with little performance loss, compared with current N2N-based algorithms. In general, the visualization of P2P provides a reference for revealing the working function of zero-shot N2N-based algorithms, which would pave the way for the application of these algorithms toward real-time (in situ, in vivo, and operando) research, improving both temporal and spatial resolution. P2P is open-source at https://github.com/3331822w/Peak2Peak and will be accessible online at https://ramancloud.xmu.edu.cn/tutorial.
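The single-image N2N pairing that such zero-shot methods exploit can be illustrated with a minimal sketch (a generic checkerboard-style downsampling; hypothetical, not the P2P code):

```python
import numpy as np

def n2n_pair(noisy):
    """Split one noisy 2D image into two half-resolution sub-images whose
    underlying signal is (almost) identical but whose noise realizations
    are independent -- a Noise2Noise training pair with no clean reference."""
    h, w = noisy.shape[0] // 2 * 2, noisy.shape[1] // 2 * 2
    x = noisy[:h, :w]                            # crop to even dimensions
    d1 = 0.5 * (x[0::2, 0::2] + x[1::2, 1::2])   # one diagonal of each 2x2 block
    d2 = 0.5 * (x[0::2, 1::2] + x[1::2, 0::2])   # the other diagonal
    return d1, d2
```

A small network f is then trained so that f(d1) approximates d2 (and symmetrically), with a regularization term, the "switch of denoising" in the authors' analysis, preventing overfitting to the noise.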
Affiliation(s)
- Si-Heng Luo
  - State Key Laboratory for Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
  - State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
- Si-Qi Pan
  - State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
- Gan-Yu Chen
  - State Key Laboratory for Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- Yi Xie
  - Fujian Key Laboratory of Sensing and Computing for Smart City, School of Information Science and Engineering, Xiamen University, Xiamen, Fujian 361005, China
  - Shenzhen Research Institute of Xiamen University, Xiamen University, Shenzhen 518000, China
- Bin Ren
  - State Key Laboratory for Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
  - Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen 361005, China
- Guo-Kun Liu
  - State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
- Zhong-Qun Tian
  - State Key Laboratory for Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
  - Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen 361005, China
100
Xiong L, Li N, Qiu W, Luo Y, Li Y, Zhang Y. Re-UNet: a novel multi-scale reverse U-shape network architecture for low-dose CT image reconstruction. Med Biol Eng Comput 2024; 62:701-712. [PMID: 37982956 DOI: 10.1007/s11517-023-02966-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/03/2023] [Accepted: 11/03/2023] [Indexed: 11/21/2023]
Abstract
In recent years, growing public-health awareness has brought attention to low-dose computed tomography (LDCT) scans. However, the CT images generated in this way contain substantial noise and artifacts, which has led an increasing number of researchers to investigate methods to enhance image quality. The advancement of deep learning has provided researchers with novel approaches to enhance the quality of LDCT images. Numerous studies based on convolutional neural networks (CNNs) have yielded remarkable results in LDCT image reconstruction. Nonetheless, they tend to design new networks on top of the fixed U-shaped UNet architecture, which leads to increasingly complex networks. In this paper, we propose a novel network model with a reverse U-shaped architecture for noise reduction in the LDCT image reconstruction task. Within the model, we further design a novel multi-scale feature extractor and an edge enhancement module that help CT images exhibit strong structural characteristics. Evaluated on a public dataset, the experimental results demonstrate that the proposed model outperforms the compared algorithms based on the traditional U-shaped architecture in preserving texture details and reducing noise, achieving the best PSNR, SSIM and RMSE values. This study may shed light on the reverse U-shaped network architecture for CT image reconstruction, whose potential could be investigated in other medical image processing tasks.
Affiliation(s)
- Lianjin Xiong
  - School of Computer Science and Technology, Laboratory for Brain Science and Medical Artificial Intelligence, Southwest University of Science and Technology, Mianyang, 621010, China
- Ning Li
  - School of Computer Science and Technology, Laboratory for Brain Science and Medical Artificial Intelligence, Southwest University of Science and Technology, Mianyang, 621010, China
- Wei Qiu
  - School of Computer Science and Technology, Laboratory for Brain Science and Medical Artificial Intelligence, Southwest University of Science and Technology, Mianyang, 621010, China
- Yiqian Luo
  - School of Computer Science and Technology, Laboratory for Brain Science and Medical Artificial Intelligence, Southwest University of Science and Technology, Mianyang, 621010, China
- Yishi Li
  - Department of Respiratory and Critical Care Medicine, The First Affiliated Hospital of Chongqing Medical University, Chongqing, 400016, China
- Yangsong Zhang
  - School of Computer Science and Technology, Laboratory for Brain Science and Medical Artificial Intelligence, Southwest University of Science and Technology, Mianyang, 621010, China
  - NHC Key Laboratory of Nuclear Technology Medical Transformation (MIANYANG CENTRAL HOSPITAL), Mianyang, 621000, China
  - Key Laboratory of Testing Technology for Manufacturing Process, Ministry of Education, Southwest University of Science and Technology, Mianyang, 621010, China