1
Wang W, An L, Han G. Multi-scale geometric transformer for sparse-view X-ray 3D foot reconstruction. Journal of X-Ray Science and Technology 2025:8953996251319194. [PMID: 40276863 DOI: 10.1177/08953996251319194] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2025]
Abstract
BACKGROUND Sparse-View X-ray 3D Foot Reconstruction aims to reconstruct the three-dimensional structure of the foot from sparse-view X-ray images, a challenging task due to data sparsity and limited viewpoints. OBJECTIVE This paper presents a novel method using a multi-scale geometric Transformer to enhance reconstruction accuracy and detail representation. METHODS Geometric position encoding technology and a window mechanism are introduced to divide X-ray images into local areas, finely capturing local features. A multi-scale Transformer module based on Neural Radiance Fields (NeRF) enhances the model's ability to express and capture details in complex structures. An adaptive weight learning strategy further optimizes the Transformer's feature extraction and long-range dependency modelling. RESULTS Experimental results demonstrate that the proposed method significantly improves the reconstruction accuracy and detail preservation of the foot structure under sparse-view X-ray conditions. The multi-scale geometric Transformer effectively captures local and global features, leading to more accurate and detailed 3D reconstructions. CONCLUSIONS The proposed method advances medical image reconstruction, significantly improving the accuracy and detail preservation of 3D foot reconstructions from sparse-view X-ray images.
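The window mechanism mentioned in the abstract can be illustrated with a generic non-overlapping window partition. This is a minimal sketch, not the authors' implementation: the function name `window_partition`, the 8×8 test image, and the window size of 4 are all assumptions chosen for illustration.

```python
import numpy as np


def window_partition(image: np.ndarray, window: int) -> np.ndarray:
    """Split an (H, W) image into non-overlapping (window, window) patches.

    Hypothetical helper for illustration; the paper's actual windowing
    scheme is not specified in the abstract.
    """
    h, w = image.shape
    if h % window or w % window:
        raise ValueError("image size must be divisible by the window size")
    # (H/ws, ws, W/ws, ws) -> (H/ws, W/ws, ws, ws) -> (N, ws, ws)
    patches = image.reshape(h // window, window, w // window, window)
    return patches.transpose(0, 2, 1, 3).reshape(-1, window, window)


# Toy "X-ray" image: each window becomes one local area for feature extraction.
xray = np.arange(64, dtype=np.float32).reshape(8, 8)
windows = window_partition(xray, 4)
print(windows.shape)
```

Each patch could then be fed to an attention block independently, which is the usual reason for windowing in Transformer models.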
Affiliation(s)
- Wei Wang
- Department of Orthopedic Surgery, Yuzhou City People's Hospital, Yuzhou, China
- Li An
- School of Information Science and Technology, Northwest University, Xi'an, China
- Gengyin Han
- Department of Orthopedic Surgery, Yuzhou City People's Hospital, Yuzhou, China
2
Liu J, Wu F, Zhan G, Wang K, Zhang Y, Hu D, Chen Y. DECT sparse reconstruction based on hybrid spectrum data generative diffusion model. Computer Methods and Programs in Biomedicine 2025; 261:108597. [PMID: 39809092 DOI: 10.1016/j.cmpb.2025.108597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2024] [Revised: 12/30/2024] [Accepted: 01/08/2025] [Indexed: 01/16/2025]
Abstract
PURPOSE Dual-energy computed tomography (DECT) enables the differentiation of different materials. Additionally, DECT images consist of multiple scans of the same sample, revealing information similarity within the energy domain. To leverage this information similarity and address safety concerns related to excessive radiation exposure in DECT imaging, sparse view DECT imaging is proposed as a solution. However, this imaging method can impact image quality. Therefore, this paper presents a hybrid spectrum data generative diffusion reconstruction model (HSGDM) to improve imaging quality. METHOD To exploit the spectral similarity of DECT, we use interleaved angles for sparse scanning to obtain low- and high-energy CT images with complementary incomplete views. Furthermore, we organize low- and high-energy CT image views into multichannel forms for training and inference and promote information exchange between low-energy features and high-energy features, thus improving the reconstruction quality while reducing the radiation dose. In the HSGDM, we build two types of diffusion model constraint terms trained by the image space and wavelet space. The wavelet space diffusion model exploits mainly the orientation and scale features of artifacts. By integrating the image space diffusion model, we establish a hybrid constraint for the iterative reconstruction framework. Ultimately, we transform the iterative approach into a cohesive sampling process guided by the measurement data, which collaboratively produces high-quality and consistent reconstructions of sparse view DECT. RESULTS Compared with the comparison methods, this approach is competitive in terms of the precision of the CT values, the preservation of details, and the elimination of artifacts. 
In the reconstruction from 30 sparse views, the method yields increases of 3.51 dB in peak signal-to-noise ratio (PSNR) and 0.03 in structural similarity index measure (SSIM), and a reduction of 74.47 in Fréchet inception distance (FID) score, on the test dataset. In the ablation study, we verified the effectiveness of the proposed hybrid prior, which consists of the wavelet prior module and the image prior module, by comparing the visual and quantitative results of an image space model, a wavelet space model, and our hybrid model. Both qualitative and quantitative analyses indicate that the proposed method performs well in sparse DECT reconstruction tasks. CONCLUSION We have developed a unified optimized mathematical model that integrates image space and wavelet space prior knowledge into an iterative model. This model is more practical and interpretable than existing approaches. The experimental results demonstrate the competitive performance of the proposed model.
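The interleaved-angle sparse scanning described in the METHOD section can be sketched as follows. This is an illustration under stated assumptions, not the paper's acquisition protocol: the helper `interleaved_angles`, the 360-degree orbit, and the choice of 30 views per energy are assumed here, since the abstract does not give the exact geometry.

```python
import numpy as np


def interleaved_angles(n_views_per_energy: int):
    """Return (low, high) projection angles in degrees that interleave.

    Hypothetical helper: even-indexed angles go to the low-energy scan and
    odd-indexed angles to the high-energy scan, so the two incomplete view
    sets are complementary.
    """
    all_angles = np.linspace(0.0, 360.0, 2 * n_views_per_energy, endpoint=False)
    return all_angles[0::2], all_angles[1::2]


low, high = interleaved_angles(30)
print(low[:3], high[:3])
```

Because the two angle sets never coincide, stacking the low- and high-energy views as channels gives the network angular information that neither scan holds alone.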
Affiliation(s)
- Jin Liu
- College of Computer and Information, Anhui Polytechnic University, Wuhu, China; Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing, China.
- Fan Wu
- College of Computer and Information, Anhui Polytechnic University, Wuhu, China
- Guorui Zhan
- College of Computer and Information, Anhui Polytechnic University, Wuhu, China
- Kun Wang
- College of Computer and Information, Anhui Polytechnic University, Wuhu, China
- Yikun Zhang
- Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing, China; School of Computer Science and Engineering, Southeast University, Nanjing, China.
- Dianlin Hu
- The Department of Health Technology and Informatics, Hong Kong Polytechnic University, Hong Kong Special Administrative Region of China
- Yang Chen
- Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing, China; School of Computer Science and Engineering, Southeast University, Nanjing, China
3
Ma X, Zou M, Fang X, Luo G, Wang W, Dong S, Li X, Wang K, Dong Q, Tian Y, Li S. Convergent-Diffusion Denoising Model for multi-scenario CT Image Reconstruction. Comput Med Imaging Graph 2025; 120:102491. [PMID: 39787736 DOI: 10.1016/j.compmedimag.2024.102491] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2024] [Revised: 10/27/2024] [Accepted: 12/31/2024] [Indexed: 01/12/2025]
Abstract
A generic and versatile CT Image Reconstruction (CTIR) scheme can efficiently mitigate imaging noise resulting from inherent physical limitations, substantially bolstering the dependability of CT imaging diagnostics across a wider spectrum of patient cases. Current CTIR techniques often concentrate on distinct areas such as Low-Dose CT denoising (LDCTD), Sparse-View CT reconstruction (SVCTR), and Metal Artifact Reduction (MAR). Nevertheless, due to the intricate nature of multi-scenario CTIR, these techniques frequently narrow their focus to specific tasks, resulting in limited generalization capabilities for diverse scenarios. We propose a novel Convergent-Diffusion Denoising Model (CDDM) for multi-scenario CTIR, which utilizes a stepwise denoising process to converge toward an imaging-noise-free image with high generalization. CDDM uses a diffusion-based process based on a priori decay distribution to steadily correct imaging noise, thus avoiding the overfitting of individual samples. Within CDDM, a domain-correlated sampling network (DS-Net) provides an innovative sinogram-guided noise prediction scheme to leverage both image and sinogram (i.e., dual-domain) information. DS-Net analyzes the correlation of the dual-domain representations for sampling the noise distribution, introducing sinogram semantics to avoid secondary artifacts. Experimental results validate the practical applicability of our scheme across various CTIR scenarios, including LDCTD, MAR, and SVCTR, with the support of sinogram knowledge.
Affiliation(s)
- Xinghua Ma
- The Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang, China; The Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Makkah, Saudi Arabia
- Mingye Zou
- The Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Xinyan Fang
- The Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Gongning Luo
- The Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang, China; The Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, Makkah, Saudi Arabia
- Wei Wang
- The Faculty of Computing, Harbin Institute of Technology, Shenzhen, Guangdong, China.
- Suyu Dong
- The College of Computer and Control Engineering, Northeast Forestry University, Harbin, Heilongjiang, China.
- Xiangyu Li
- The Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang, China.
- Kuanquan Wang
- The Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang, China
- Qing Dong
- The Department of Thoracic Surgery at No. 4 Affiliated Hospital, Harbin Medical University, Harbin, Heilongjiang, China
- Ye Tian
- The Department of Cardiology at No. 1 Affiliated Hospital, Harbin Medical University, Harbin, Heilongjiang, China
- Shuo Li
- The Department of Computer and Data Science, Case Western Reserve University, Cleveland, OH, USA; The Department of Biomedical Engineering, Case Western Reserve University, Cleveland, OH, USA
4
Li Y, Sun X, Wang S, Guo L, Qin Y, Pan J, Chen P. TD-STrans: Tri-domain sparse-view CT reconstruction based on sparse transformer. Computer Methods and Programs in Biomedicine 2025; 260:108575. [PMID: 39733746 DOI: 10.1016/j.cmpb.2024.108575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2024] [Revised: 12/15/2024] [Accepted: 12/24/2024] [Indexed: 12/31/2024]
Abstract
BACKGROUND AND OBJECTIVE Sparse-view computed tomography (CT) speeds up scanning and reduces radiation exposure in medical diagnosis. However, when the projection views are severely under-sampled, deep learning-based reconstruction methods often suffer from over-smoothing of the reconstructed images due to the lack of high-frequency information. To address this issue, we introduce frequency domain information into the popular projection-image domain reconstruction, proposing a Tri-Domain sparse-view CT reconstruction model based on Sparse Transformer (TD-STrans). METHODS TD-STrans integrates three essential modules: the projection recovery module completes the sparse-view projection, the Fourier domain filling module mitigates artifacts and over-smoothing by filling in missing high-frequency details; the image refinement module further enhances and preserves image details. Additionally, a multi-domain joint loss function is designed to simultaneously enhance the reconstruction quality in the projection domain, image domain, and frequency domain, thereby further improving the preservation of image details. RESULTS The results of simulation experiments on the lymph node dataset and real experiments on the walnut dataset consistently demonstrate the effectiveness of TD-STrans in artifact removal, suppression of over-smoothing, and preservation of structural fidelity. CONCLUSION The reconstruction results of TD-STrans indicate that sparse transformer across multiple domains can alleviate over-smoothing and detail loss caused by reduced views, offering a novel solution for ultra-sparse-view CT imaging.
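A multi-domain joint loss of the kind the METHODS section describes might look like the sketch below. The abstract does not give the actual loss terms or weights, so everything here is assumed for illustration: the function name `joint_loss`, the mean-squared-error terms, the mean absolute spectrum difference, and the weights `(1.0, 1.0, 0.1)`.

```python
import numpy as np


def joint_loss(proj_pred, proj_true, img_pred, img_true, weights=(1.0, 1.0, 0.1)):
    """Weighted sum of projection-, image-, and frequency-domain errors.

    Hypothetical formulation: the weights and the choice of error measures
    are assumptions, not values from the paper.
    """
    w_proj, w_img, w_freq = weights
    loss_proj = np.mean((proj_pred - proj_true) ** 2)  # projection domain
    loss_img = np.mean((img_pred - img_true) ** 2)     # image domain
    # Frequency domain: compare 2D Fourier spectra of the two images.
    spec_diff = np.fft.fft2(img_pred, norm="ortho") - np.fft.fft2(img_true, norm="ortho")
    loss_freq = np.mean(np.abs(spec_diff))
    return w_proj * loss_proj + w_img * loss_img + w_freq * loss_freq


rng = np.random.default_rng(0)
proj = rng.random((30, 64))  # toy sinogram: 30 views, 64 detector bins
img = rng.random((32, 32))   # toy reconstructed slice
loss_same = joint_loss(proj, proj, img, img)
loss_diff = joint_loss(proj + 0.1, proj, img + 0.1, img)
print(loss_same, loss_diff)
```

The frequency term penalizes missing high-frequency content directly, which is one plausible way to counteract the over-smoothing the abstract attributes to severe under-sampling.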
Affiliation(s)
- Yu Li
- Department of Information and Communication Engineering, North University of China, Taiyuan 030051, China; The State Key Lab for Electronic Testing Technology, North University of China, Taiyuan 030051, China
- Xueqin Sun
- Department of Information and Communication Engineering, North University of China, Taiyuan 030051, China; The State Key Lab for Electronic Testing Technology, North University of China, Taiyuan 030051, China
- Sukai Wang
- The State Key Lab for Electronic Testing Technology, North University of China, Taiyuan 030051, China; Department of computer science and technology, North University of China, Taiyuan 030051, China
- Lina Guo
- Department of Information and Communication Engineering, North University of China, Taiyuan 030051, China; The State Key Lab for Electronic Testing Technology, North University of China, Taiyuan 030051, China
- Yingwei Qin
- Department of Information and Communication Engineering, North University of China, Taiyuan 030051, China; The State Key Lab for Electronic Testing Technology, North University of China, Taiyuan 030051, China
- Jinxiao Pan
- Department of Information and Communication Engineering, North University of China, Taiyuan 030051, China; The State Key Lab for Electronic Testing Technology, North University of China, Taiyuan 030051, China
- Ping Chen
- Department of Information and Communication Engineering, North University of China, Taiyuan 030051, China; The State Key Lab for Electronic Testing Technology, North University of China, Taiyuan 030051, China.
5
Zhao X, Du Y, Peng Y. Deep Learning-Based Multi-View Projection Synthesis Approach for Improving the Quality of Sparse-View CBCT in Image-Guided Radiotherapy. Journal of Imaging Informatics in Medicine 2025:10.1007/s10278-025-01390-0. [PMID: 39849201 DOI: 10.1007/s10278-025-01390-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2024] [Revised: 12/16/2024] [Accepted: 12/19/2024] [Indexed: 01/25/2025]
Abstract
While radiation hazards induced by cone-beam computed tomography (CBCT) in image-guided radiotherapy (IGRT) can be reduced by sparse-view sampling, the image quality is inevitably degraded. We propose a deep learning-based multi-view projection synthesis (DLMPS) approach to improve the quality of sparse-view low-dose CBCT images. In the proposed DLMPS approach, linear interpolation was first applied to sparse-view projections and the projections were rearranged into sinograms; these sinograms were processed with a sinogram restoration model and then rearranged back into projections. The sinogram restoration model was modified from the 2D U-Net by incorporating dynamic convolutional layers and residual learning techniques. The DLMPS approach was trained, validated, and tested on CBCT data from 163, 30, and 30 real patients, respectively. Sparse-view projection datasets with 1/4 and 1/8 of the original sampling rate were simulated, and the corresponding full-view projection datasets were restored via the DLMPS approach. Tomographic images were reconstructed using the Feldkamp-Davis-Kress algorithm. Quantitative metrics including root-mean-square error (RMSE), peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and feature similarity (FSIM) were calculated in both the projection and image domains to evaluate the performance of the DLMPS approach. The DLMPS approach was compared with 11 state-of-the-art (SOTA) models, including CNN and Transformer architectures. For the 1/4 sparse-view reconstruction task, the proposed DLMPS approach achieved averaged RMSE, PSNR, SSIM, and FSIM values of 0.0271, 45.93 dB, 0.9817, and 0.9587 in the projection domain, and 0.000885, 37.63 dB, 0.9074, and 0.9885 in the image domain, respectively. For the 1/8 sparse-view reconstruction task, the DLMPS approach achieved averaged RMSE, PSNR, SSIM, and FSIM values of 0.0304, 44.85 dB, 0.9785, and 0.9524 in the projection domain, and 0.001057, 36.05 dB, 0.8786, and 0.9774 in the image domain, respectively. The DLMPS approach outperformed all 11 SOTA models in both the projection and image domains for the 1/4 and 1/8 sparse-view reconstruction tasks. The proposed DLMPS approach effectively improves the quality of sparse-view CBCT images in IGRT by accurately synthesizing missing projections, exhibiting potential for substantially reducing imaging dose to patients with minimal loss of image quality.
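The first two steps of the pipeline described above, linear interpolation of the sparse-view projections followed by rearrangement into sinograms, can be sketched as below. This is a minimal illustration, not the authors' code: the helper names `upsample_views` and `to_sinograms`, the `(views, rows, columns)` array layout, and the wrap-around between the last and first view (which assumes a full 360-degree orbit) are all assumptions.

```python
import numpy as np


def upsample_views(sparse: np.ndarray, factor: int) -> np.ndarray:
    """Linearly interpolate projections along the view axis.

    sparse: (n_sparse_views, n_rows, n_cols). Hypothetical helper that
    assumes a full circular orbit, so the last view blends into the first.
    """
    n = sparse.shape[0]
    pos = np.arange(n * factor) / factor  # position in sparse-view units
    lo = np.floor(pos).astype(int)        # nearest measured view below
    hi = (lo + 1) % n                     # next measured view (wraps around)
    w = (pos - lo).reshape(-1, 1, 1)      # interpolation weight in [0, 1)
    return (1.0 - w) * sparse[lo] + w * sparse[hi]


def to_sinograms(projections: np.ndarray) -> np.ndarray:
    """Rearrange (views, rows, cols) into one (views, cols) sinogram per row."""
    return projections.transpose(1, 0, 2)


# Toy data: 4 measured views of a 2x3 detector, upsampled by 4 (1/4 sampling).
sparse = np.stack([np.full((2, 3), v) for v in (0.0, 4.0, 8.0, 12.0)])
full = upsample_views(sparse, 4)
sinograms = to_sinograms(full)
print(full.shape, sinograms.shape)
```

In the approach described in the abstract, a restoration network would then refine these interpolated sinograms before they are rearranged back into projections; that step is not shown here.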
Affiliation(s)
- Xuzhi Zhao
- School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China
- Yi Du
- Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Department of Radiation Oncology, Peking University Cancer Hospital & Institute, Beijing, China.
- Institute of Medical Technology, Peking University Health Science Center, Beijing, China.
- Yahui Peng
- School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China.
6
Wu J, Jiang X, Zhong L, Zheng W, Li X, Lin J, Li Z. Linear diffusion noise boosted deep image prior for unsupervised sparse-view CT reconstruction. Phys Med Biol 2024; 69:165029. [PMID: 39119998 DOI: 10.1088/1361-6560/ad69f7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2024] [Accepted: 07/31/2024] [Indexed: 08/10/2024]
Abstract
Objective. Deep learning has markedly enhanced the performance of sparse-view computed tomography reconstruction. However, the dependence of these methods on supervised training using high-quality paired datasets, and the necessity for retraining under varied physical acquisition conditions, constrain their generalizability across new imaging contexts and settings. Approach. To overcome these limitations, we propose an unsupervised approach grounded in the deep image prior framework. Our approach advances beyond the conventional single noise level input by incorporating multi-level linear diffusion noise, significantly mitigating the risk of overfitting. Furthermore, we embed non-local self-similarity as a deep implicit prior within a self-attention network structure, improving the model's capability to identify and utilize repetitive patterns throughout the image. Additionally, leveraging imaging physics, gradient backpropagation is performed between the image domain and projection data space to optimize network weights. Main Results. Evaluations with both simulated and clinical cases demonstrate our method's effective zero-shot adaptability across various projection views, highlighting its robustness and flexibility. Additionally, our approach effectively eliminates noise and streak artifacts while significantly restoring intricate image details. Significance. Our method aims to overcome the limitations in current supervised deep learning-based sparse-view CT reconstruction, offering improved generalizability and adaptability without the need for extensive paired training data.
Affiliation(s)
- Jia Wu
- School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, People's Republic of China
- School of Medical Information and Engineering, Southwest Medical University, Luzhou 646000, People's Republic of China
- Xiaoming Jiang
- Chongqing Engineering Research Center of Medical Electronics and Information Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, People's Republic of China
- Lisha Zhong
- School of Medical Information and Engineering, Southwest Medical University, Luzhou 646000, People's Republic of China
- Wei Zheng
- Key Laboratory of Big Data Intelligent Computing, Chongqing University of Posts and Telecommunications, Chongqing 400065, People's Republic of China
- Xinwei Li
- Chongqing Engineering Research Center of Medical Electronics and Information Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, People's Republic of China
- Jinzhao Lin
- School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, People's Republic of China
- Zhangyong Li
- Chongqing Engineering Research Center of Medical Electronics and Information Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, People's Republic of China