1. Allapakam V, Karuna Y. An ensemble deep learning model for medical image fusion with Siamese neural networks and VGG-19. PLoS One 2024;19:e0309651. PMID: 39441782; PMCID: PMC11498686; DOI: 10.1371/journal.pone.0309651.
Abstract
Multimodal medical image fusion methods, which combine complementary information from multiple imaging modalities, are among the most important and practical approaches in numerous clinical applications. Various conventional techniques have been developed for multimodality image fusion, but complex weight-map computation, fixed fusion strategies and a lack of contextual understanding remain difficult in conventional and machine-learning approaches, usually resulting in artefacts that degrade image quality. This work proposes an efficient hybrid learning model for medical image fusion using pre-trained and non-pre-trained networks, i.e., VGG-19 and a Siamese neural network (SNN), combined through a stacking ensemble method. By leveraging the unique capabilities of each architecture, the model can effectively preserve detailed information with high visual quality across numerous combinations of image modalities, with notably improved contrast, increased resolution, and fewer artefacts. Additionally, the ensemble model is more robust when fusing the various combinations of source images publicly available from the Harvard Medical Image Fusion datasets, GitHub, and Kaggle. The proposed model is superior, in both visual quality and performance metrics, to existing fusion methods in the literature such as PCA+DTCWT, NSCT, DWT, DTCWT+NSCT, GADCT, CNN and VGG-19.
Affiliation(s)
- Venu Allapakam: School of Electronics Engineering, Vellore Institute of Technology, Vellore, India
- Yepuganti Karuna: School of Electronics Engineering, VIT-AP University, Amaravathi, India
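The stacking ensemble itself is specific to this paper, but the VGG-19 half of such a pipeline usually reduces to feature-based activity weighting. A minimal PyTorch sketch of that general idea follows, assuming registered grayscale sources in [0, 1]; the Siamese branch and the ensemble logic are not reproduced, so this illustrates the technique rather than the authors' implementation.

```python
import torch
from torchvision.models import vgg19, VGG19_Weights

# Pretrained VGG-19 feature extractor; the first few conv blocks are
# enough for activity measurement, and no fine-tuning is needed.
features = vgg19(weights=VGG19_Weights.IMAGENET1K_V1).features[:9].eval()

def activity_map(img: torch.Tensor) -> torch.Tensor:
    """L1-norm of deep feature maps as a per-pixel activity measure.

    img: (H, W) grayscale tensor in [0, 1].
    """
    x = img[None, None].repeat(1, 3, 1, 1)       # grayscale -> 3 channels
    with torch.no_grad():
        f = features(x)                          # (1, C, H', W')
    act = f.abs().sum(dim=1, keepdim=True)       # channel-wise L1 activity
    # Upsample back to the source resolution.
    return torch.nn.functional.interpolate(
        act, size=img.shape, mode="bilinear", align_corners=False)[0, 0]

def fuse(img_a: torch.Tensor, img_b: torch.Tensor) -> torch.Tensor:
    """Softmax-weighted fusion of two registered source images."""
    w = torch.softmax(torch.stack([activity_map(img_a),
                                   activity_map(img_b)]), dim=0)
    return w[0] * img_a + w[1] * img_b
```

The softmax over activity maps is one common weighting choice; the paper's actual fusion strategy, learned by the ensemble, may differ.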
2. Chen Y, Liu A, Liu Y, He Z, Liu C, Chen X. Multi-Dimensional Medical Image Fusion With Complex Sparse Representation. IEEE Trans Biomed Eng 2024;71:2728-2739. PMID: 38652633; DOI: 10.1109/TBME.2024.3391314.
Abstract
In the field of medical imaging, the fusion of data from diverse modalities plays a pivotal role in advancing our understanding of pathological conditions. Sparse representation (SR), a robust signal modeling technique, has demonstrated noteworthy success in multi-dimensional (MD) medical image fusion. However, a fundamental limitation of existing SR models is their lack of directionality, which restricts their efficacy in extracting anatomical details from different imaging modalities. To tackle this issue, we propose a novel directional SR model, termed complex sparse representation (ComSR), specifically designed for medical image fusion. ComSR independently represents MD signals over directional dictionaries along specific directions, allowing precise analysis of the intricate details of MD signals. Moreover, current studies in medical image fusion mostly concentrate on either 2D or 3D fusion problems. This work bridges that gap by proposing an MD medical image fusion method based on ComSR, presenting a unified framework for both 2D and 3D fusion tasks. Experimental results across six multi-modal medical image fusion tasks, involving 93 pairs of 2D source images and 20 pairs of 3D source images, substantiate the superiority of the proposed method over 11 state-of-the-art 2D fusion methods and 4 representative 3D fusion methods, in terms of both visual quality and objective evaluation.
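As background for the directional model, the baseline patch-based SR fusion that ComSR generalizes is commonly written as follows. This is the standard formulation from the SR fusion literature, not the paper's ComSR model itself.

```latex
\min_{\alpha_i}\ \|\alpha_i\|_0
\quad\text{s.t.}\quad \|x_i - D\,\alpha_i\|_2 \le \varepsilon,
\qquad
\alpha_i^{F} =
\begin{cases}
\alpha_i^{A}, & \|\alpha_i^{A}\|_1 \ge \|\alpha_i^{B}\|_1,\\
\alpha_i^{B}, & \text{otherwise},
\end{cases}
```

where each source patch $x_i$ is coded over a learned dictionary $D$ and the fused coefficients $\alpha_i^F$ follow the max-L1 activity rule; ComSR replaces $D$ with directional dictionaries applied along specific orientations.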
3. Liu Y, Mu F, Shi Y, Cheng J, Li C, Chen X. Brain tumor segmentation in multimodal MRI via pixel-level and feature-level image fusion. Front Neurosci 2022;16:1000587. PMID: 36188482; PMCID: PMC9515796; DOI: 10.3389/fnins.2022.1000587.
Abstract
Brain tumor segmentation in multimodal MRI volumes is of great significance for disease diagnosis, treatment planning, survival prediction and other relevant tasks. However, most existing brain tumor segmentation methods fail to make sufficient use of multimodal information: the most common approach is simply to stack the original multimodal images or their low-level features as the model input, and many methods treat every modality as equally important for a given segmentation target. In this paper, we introduce multimodal image fusion techniques, covering both pixel-level and feature-level fusion, for brain tumor segmentation, aiming at a fuller and finer utilization of multimodal information. At the pixel level, we present a convolutional network named PIF-Net for 3D MR image fusion to enrich the input modalities of the segmentation model. The fused modalities strengthen the association among the different types of pathological information captured by the source modalities, producing a modality-enhancement effect. At the feature level, we design an attention-based modality selection feature fusion (MSFF) module for multimodal feature refinement that addresses the differences among modalities with respect to a given segmentation target. A two-stage brain tumor segmentation framework is then built from these components and the popular V-Net model. Experiments are conducted on the BraTS 2019 and BraTS 2020 benchmarks. The results demonstrate that the proposed pixel-level and feature-level fusion components both effectively improve the segmentation accuracy of brain tumors.
Affiliation(s)
- Yu Liu: Department of Biomedical Engineering, Hefei University of Technology, Hefei, China; Anhui Province Key Laboratory of Measuring Theory and Precision Instrument, Hefei University of Technology, Hefei, China
- Fuhao Mu: Department of Biomedical Engineering, Hefei University of Technology, Hefei, China
- Yu Shi: Department of Biomedical Engineering, Hefei University of Technology, Hefei, China
- Juan Cheng: Department of Biomedical Engineering, Hefei University of Technology, Hefei, China; Anhui Province Key Laboratory of Measuring Theory and Precision Instrument, Hefei University of Technology, Hefei, China
- Chang Li: Department of Biomedical Engineering, Hefei University of Technology, Hefei, China; Anhui Province Key Laboratory of Measuring Theory and Precision Instrument, Hefei University of Technology, Hefei, China
- Xun Chen (corresponding author): Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei, China
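The MSFF module itself is defined only in the paper; as a rough illustration of attention-based modality selection over stacked feature maps, a squeeze-and-excitation-style sketch in PyTorch follows. The module structure and channel sizes are hypothetical placeholders, not the authors' architecture.

```python
import torch
import torch.nn as nn

class ModalityAttention(nn.Module):
    """Reweight per-modality feature maps with learned channel attention.

    A squeeze-and-excitation-style stand-in for the paper's MSFF module;
    the real block is defined in the paper and may differ substantially.
    """
    def __init__(self, n_modalities: int, channels: int):
        super().__init__()
        total = n_modalities * channels
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool3d(1),          # squeeze spatial dimensions
            nn.Flatten(),
            nn.Linear(total, total // 4),
            nn.ReLU(inplace=True),
            nn.Linear(total // 4, total),
            nn.Sigmoid(),                     # per-channel weights in (0, 1)
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, n_modalities * channels, D, H, W) stacked 3D features
        w = self.gate(feats)                  # (B, n_modalities * channels)
        return feats * w[:, :, None, None, None]   # broadcast reweighting
```

The intent matches the abstract's description, letting the network emphasize the modalities most informative for a given segmentation target, rather than treating all modalities equally.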
4. Li W, Li R, Fu J, Peng X. MSENet: A multi-scale enhanced network based on unique features guidance for medical image fusion. Biomed Signal Process Control 2022. DOI: 10.1016/j.bspc.2022.103534.
5. Discrete Shearlets as a Sparsifying Transform in Low-Rank Plus Sparse Decomposition for Undersampled (k, t)-Space MR Data. J Imaging 2022;8:29. PMID: 35200731; PMCID: PMC8878450; DOI: 10.3390/jimaging8020029.
Abstract
The discrete shearlet transform accurately represents the discontinuities and edges occurring in magnetic resonance imaging, making it an excellent choice of sparsifying transform. In the present paper, we examine the use of discrete shearlets over other sparsifying transforms in a low-rank plus sparse decomposition problem, denoted L+S. The proposed algorithm is evaluated on simulated dynamic contrast-enhanced (DCE) and small bowel data. For the small bowel, eight subjects were scanned; the sequence was run first on breath-holding and subsequently on free-breathing, without changing the anatomical position of the subject. The reconstruction performance of the proposed algorithm was evaluated against k-t FOCUSS. L+S decomposition using discrete shearlets as the sparsifying transform successfully separated the low-rank component (background and periodic motion) from the sparse component (enhancement or bowel motility) for both DCE and small bowel data. Motion estimated from the low-rank component of the DCE data is closer to the ground-truth deformations than motion estimated from L+S. Motility metrics derived from the S component of free-breathing data were not significantly different from those derived from breath-holding data up to four-fold undersampling, indicating that rapid, random bowel motility is isolated in S. Our work strongly supports the use of discrete shearlets as a sparsifying transform in an L+S decomposition for undersampled MR data.
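The reconstruction problem underlying this entry is the standard low-rank plus sparse model with a sparsifying transform, written here from the abstract's description (the authors' particular solver and parameters are not implied):

```latex
\min_{L,\,S}\ \|L\|_{*} \;+\; \lambda\,\|T S\|_{1}
\quad\text{s.t.}\quad E(L + S) = d,
```

where $d$ is the acquired (k, t)-space data, $E$ the undersampled encoding operator, $L$ the low-rank component (background and periodic motion), $S$ the sparse component (enhancement or motility), and $T$ the sparsifying transform, here the discrete shearlet transform.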
6. FuseVis: Interpreting Neural Networks for Image Fusion Using Per-Pixel Saliency Visualization. Computers 2020;9:98. DOI: 10.3390/computers9040098.
Abstract
Image fusion helps in merging two or more images to construct a more informative single fused image. Recently, unsupervised learning-based convolutional neural networks (CNNs) have been used for different types of image-fusion tasks, such as medical image fusion, infrared-visible image fusion for autonomous driving, and multi-focus and multi-exposure image fusion for satellite imagery. However, it is challenging to analyze the reliability of these CNNs for image-fusion tasks since no ground truth is available. This has led to a wide variety of model architectures and optimization functions yielding quite different fusion results. Additionally, due to the highly opaque nature of such neural networks, it is difficult to explain the internal mechanics behind their fusion results. To overcome these challenges, we present a novel real-time visualization tool, named FuseVis, with which the end user can compute per-pixel saliency maps that examine the influence of the input image pixels on each pixel of the fused image. We trained several image-fusion CNNs on medical image pairs and then, using the FuseVis tool, performed case studies on a specific clinical application by interpreting the saliency maps of each fusion method. We specifically visualized the relative influence of each input image on the predictions of the fused image and showed that some of the evaluated image-fusion methods are better suited to the specific clinical application. To the best of our knowledge, there is currently no other approach for the visual analysis of neural networks for image fusion, so this work opens a new research direction for improving the interpretability of deep fusion networks. The FuseVis tool can also be adapted to other deep neural network-based image processing applications to make them interpretable.
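At bottom, FuseVis's per-pixel saliency maps are partial derivatives of a fused output pixel with respect to every input pixel. A minimal autograd sketch of that quantity is below; `fusion_net` is a hypothetical stand-in for any differentiable fusion model, and the tool's real-time backpropagation scheme and GUI are not reproduced.

```python
import torch

def pixel_saliency(fusion_net, img_a, img_b, i, j):
    """Gradient of fused pixel (i, j) w.r.t. both input images.

    Returns two maps showing how strongly each input pixel influences
    that single output pixel. fusion_net is a hypothetical model taking
    two (1, 1, H, W) tensors and returning a fused (1, 1, H, W) tensor.
    """
    a = img_a.clone().requires_grad_(True)
    b = img_b.clone().requires_grad_(True)
    fused = fusion_net(a, b)
    grad_a, grad_b = torch.autograd.grad(fused[0, 0, i, j], (a, b))
    return grad_a[0, 0].abs(), grad_b[0, 0].abs()
```

Repeating this over all output pixels yields the full per-pixel saliency visualization the paper describes, which is why an efficient real-time scheme matters in practice.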
7. Tang W, Liu Y, Cheng J, Li C, Peng H, Chen X. A phase congruency-based green fluorescent protein and phase contrast image fusion method in nonsubsampled shearlet transform domain. Microsc Res Tech 2020;83:1225-1234. PMID: 32472956; DOI: 10.1002/jemt.23514.
Abstract
Image fusion is an effective way to merge the information contained in different imaging modalities by generating a more informative composite image. Fusion of green fluorescent protein (GFP) and phase contrast images is of great significance for subcellular localization, the functional analysis of proteins, and the study of gene expression. In this article, a phase congruency (PC)-based GFP and phase contrast image fusion method in the nonsubsampled shearlet transform (NSST) domain is presented. The input images are decomposed by the NSST to acquire multiscale and multidirectional representations. The high-frequency coefficients are fused with a strategy based on PC and a parameter-adaptive pulse-coupled neural network (PA-PCNN), while the low-frequency coefficients are integrated through a local-energy (LE)-based rule. Finally, the fused image is generated by applying the inverse NSST to the merged high- and low-frequency coefficients. Experimental results illustrate that the presented method outperforms several state-of-the-art GFP and phase contrast image fusion algorithms in both qualitative and quantitative assessments.
Affiliation(s)
- Wei Tang: Department of Biomedical Engineering, Hefei University of Technology, Hefei, China
- Yu Liu: Department of Biomedical Engineering, Hefei University of Technology, Hefei, China
- Juan Cheng: Department of Biomedical Engineering, Hefei University of Technology, Hefei, China
- Chang Li: Department of Biomedical Engineering, Hefei University of Technology, Hefei, China
- Hu Peng: Department of Biomedical Engineering, Hefei University of Technology, Hefei, China
- Xun Chen: Department of Electronic Science and Technology, University of Science and Technology of China, Hefei, China
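No widely used Python package ships an NSST implementation, so the abstract's pipeline can only be sketched with hypothetical helpers: `nsst_decompose` and `nsst_reconstruct` below are placeholders, and a max-abs rule stands in for the paper's PC + PA-PCNN high-frequency strategy.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_energy(band: np.ndarray, size: int = 3) -> np.ndarray:
    """Windowed energy; the paper's exact window is not given in the abstract."""
    return uniform_filter(band ** 2, size=size)

def fuse_nsst(gfp: np.ndarray, phase: np.ndarray) -> np.ndarray:
    """Skeleton of the NSST-domain fusion described in the abstract."""
    low_a, highs_a = nsst_decompose(gfp)      # hypothetical helper
    low_b, highs_b = nsst_decompose(phase)    # hypothetical helper

    # Low frequencies: local-energy rule.
    low_f = np.where(local_energy(low_a) >= local_energy(low_b),
                     low_a, low_b)

    # High frequencies: the paper feeds phase congruency into a
    # parameter-adaptive PCNN; a max-abs rule stands in for it here.
    highs_f = [np.where(np.abs(ha) >= np.abs(hb), ha, hb)
               for ha, hb in zip(highs_a, highs_b)]

    return nsst_reconstruct(low_f, highs_f)   # hypothetical helper
```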
8. Wang K, Zheng M, Wei H, Qi G, Li Y. Multi-Modality Medical Image Fusion Using Convolutional Neural Network and Contrast Pyramid. Sensors (Basel) 2020;20:2169. PMID: 32290472; PMCID: PMC7218740; DOI: 10.3390/s20082169.
Abstract
Medical image fusion techniques can fuse medical images from different modalities to make medical diagnosis more reliable and accurate, and they play an increasingly important role in many clinical applications. To obtain a fused image with high visual quality and clear structural details, this paper proposes a convolutional neural network (CNN)-based medical image fusion algorithm. The proposed algorithm uses a trained Siamese convolutional network to fuse the pixel activity information of the source images and generate a weight map, while a contrast pyramid is used to decompose the source images. The source images are then integrated across the different spatial frequency bands with a weighted fusion operator. Comparative experiments show that the proposed fusion algorithm effectively preserves the detailed structural information of the source images and achieves good visual results.
Affiliation(s)
- Kunpeng Wang: School of Information Engineering, Southwest University of Science and Technology, Mianyang 621010, China; Robot Technology Used for Special Environment Key Laboratory of Sichuan Province, Mianyang 621010, China
- Mingyao Zheng: College of Automation, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
- Hongyan Wei: College of Automation, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
- Guanqiu Qi: Computer Information Systems Department, State University of New York at Buffalo State, Buffalo, NY 14222, USA
- Yuanyuan Li: College of Automation, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
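The contrast pyramid used here is the classic ratio-of-lowpass construction. A short OpenCV sketch of building one follows, assuming a float grayscale input; the paper's Siamese weight-map generation and its exact fusion operator are separate and omitted.

```python
import cv2
import numpy as np

def contrast_pyramid(img: np.ndarray, levels: int = 4):
    """Classic contrast pyramid: C_k = G_k / expand(G_{k+1}) - 1.

    img: float32 image scaled to roughly [0, 1]; eps avoids division by zero.
    Returns the band-pass contrast levels plus the coarsest Gaussian level.
    """
    eps = 1e-6
    gauss = [img.astype(np.float32)]
    for _ in range(levels):
        gauss.append(cv2.pyrDown(gauss[-1]))      # Gaussian pyramid
    contrast = []
    for k in range(levels):
        # Expand the next-coarser level back to this level's size.
        up = cv2.pyrUp(gauss[k + 1], dstsize=gauss[k].shape[1::-1])
        contrast.append(gauss[k] / (up + eps) - 1.0)
    return contrast, gauss[-1]
```

Fusion then amounts to combining the two pyramids band by band (e.g., with the CNN-derived weight map) and inverting the construction by multiplication and successive expansion.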
9. Application of Image Fusion in Diagnosis and Treatment of Liver Cancer. Appl Sci (Basel) 2020;10:1171. DOI: 10.3390/app10031171.
Abstract
With the accelerated development of medical imaging equipment and techniques, image fusion technology has been effectively applied to diagnosis, biopsy and radiofrequency ablation, especially for liver tumors. Tumor treatment that relies on a single medical imaging modality may face challenges due to the deep positioning of lesions, prior operations and the specific background conditions of liver disease. Image fusion technology has been employed to address these challenges: it provides real-time anatomical imaging superimposed with functional images of the same plane to facilitate the diagnosis and treatment of liver tumors. This paper reviews the key principles of image fusion technology and its application in tumor treatment, particularly for liver tumors, and concludes with a discussion of the limitations and prospects of the technology.
10. Wang Y, Wang Y. Fusion of 3-D medical image gradient domain based on detail-driven and directional structure tensor. J Xray Sci Technol 2020;28:1001-1016. PMID: 32675434; DOI: 10.3233/XST-200684.
Abstract
BACKGROUND: Multi-modal medical image fusion plays a crucial role in many areas of modern medicine, such as diagnosis and therapy planning.
OBJECTIVE: Because the structure tensor preserves image geometry, we used it to construct a directional structure tensor and proposed an improved 3-D medical image fusion method.
METHODS: Local entropy metrics were used to construct the gradient weights of the different source images, and the eigenvectors of the traditional structure tensor were combined with the second-order derivatives of the image to construct the directional structure tensor. In addition, guided filtering was employed to obtain the detail components of the source images and to construct a fused gradient field with enhanced detail. Finally, the fused image was generated by solving a functional minimization problem.
RESULTS AND CONCLUSION: Experimental results demonstrated that the new method is superior to the traditional structure tensor and to multi-scale analysis in both visual effect and quantitative assessment.
Affiliation(s)
- Yu Wang: School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology, Shanghai, China
- Yuanjun Wang: School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology, Shanghai, China
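For reference, the classical 2-D structure tensor that the directional variant extends is defined as follows; the paper's additions (second-order derivatives along the eigenvector directions and local-entropy gradient weights) are described only in the original text.

```latex
J_\rho(\nabla I) \;=\; G_\rho * \bigl(\nabla I\,\nabla I^{\top}\bigr)
\;=\; G_\rho * \begin{pmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{pmatrix},
```

where $G_\rho$ is a Gaussian of scale $\rho$; the eigenvectors give the local orientation and the eigenvalues measure coherence. For 3-D volumes the same construction yields a $3\times3$ tensor per voxel.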
11. Yang Y, Wu J, Huang S, Fang Y, Lin P, Que Y. Multimodal Medical Image Fusion Based on Fuzzy Discrimination With Structural Patch Decomposition. IEEE J Biomed Health Inform 2019;23:1647-1660. DOI: 10.1109/JBHI.2018.2869096.
12. A New Deep Learning Based Multi-Spectral Image Fusion Method. Entropy (Basel) 2019;21:570. PMID: 33267284; PMCID: PMC7515058; DOI: 10.3390/e21060570.
Abstract
In this paper, we present a new and effective infrared (IR) and visible (VIS) image fusion method based on a deep neural network. In our method, a Siamese convolutional neural network (CNN) is applied to automatically generate a weight map that represents the saliency of each pixel for a pair of source images. The CNN automatically encodes an image into a feature domain for classification. With the proposed method, the key problems in image fusion, namely activity-level measurement and fusion-rule design, can be addressed in one shot. The fusion is carried out through multi-scale image decomposition based on the wavelet transform, and the reconstruction result is more consistent with human visual perception. In addition, the visual effectiveness of the proposed fusion method is evaluated by comparing pedestrian detection results with those of other methods, using the YOLOv3 object detector on a public benchmark dataset. The experimental results show that the proposed method is competitive in terms of both quantitative assessment and visual quality.
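Once the Siamese CNN has produced its pixel-wise weight map, the wavelet-domain fusion the abstract describes can be sketched with PyWavelets. In the sketch below the network output is assumed precomputed as `weight`, and the max-abs detail rule is a common stand-in; the paper's exact per-band rules may differ.

```python
import numpy as np
import pywt

def _resize_to(arr: np.ndarray, shape) -> np.ndarray:
    """Nearest-neighbour resize by index mapping (dependency-light)."""
    rows = np.minimum(np.arange(shape[0]) * arr.shape[0] // shape[0],
                      arr.shape[0] - 1)
    cols = np.minimum(np.arange(shape[1]) * arr.shape[1] // shape[1],
                      arr.shape[1] - 1)
    return arr[np.ix_(rows, cols)]

def wavelet_fuse(ir, vis, weight, wavelet="db2", level=3):
    """Weight-map fusion in the wavelet domain.

    weight: per-pixel IR saliency in [0, 1], assumed precomputed (a
    stand-in for the paper's Siamese CNN output). Approximation bands
    are blended with the weight; detail bands use a max-abs rule.
    """
    ca = pywt.wavedec2(ir, wavelet, level=level)
    cb = pywt.wavedec2(vis, wavelet, level=level)
    w = _resize_to(weight, ca[0].shape)          # match coarsest band size
    fused = [w * ca[0] + (1 - w) * cb[0]]        # approximation: blend
    for da, db in zip(ca[1:], cb[1:]):           # details: max-abs rule
        fused.append(tuple(np.where(np.abs(a) >= np.abs(b), a, b)
                           for a, b in zip(da, db)))
    return pywt.waverec2(fused, wavelet)[:ir.shape[0], :ir.shape[1]]
```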
13. Arif M, Wang G. Fast curvelet transform through genetic algorithm for multimodal medical image fusion. Soft Comput 2019. DOI: 10.1007/s00500-019-04011-5.
14. Tensor Sparse Representation for 3-D Medical Image Fusion Using Weighted Average Rule. IEEE Trans Biomed Eng 2018;65:2622-2633. PMID: 29993511; DOI: 10.1109/TBME.2018.2811243.
Abstract
OBJECTIVE: The technique of fusing multimodal medical images into a single image has a great impact on clinical diagnosis. Previous works mostly concern two-dimensional (2-D) image fusion performed on each slice individually, which may destroy the 3-D correlation across adjacent slices. To address this issue, this paper proposes a novel 3-D image fusion scheme based on tensor sparse representation (TSR).
METHODS: First, each medical volume is arranged as a three-order tensor and represented by TSR with learned dictionaries. Second, a novel "weighted average" rule is calculated from the tensor sparse coefficients using a 3-D local-to-global strategy. The weights are then employed to combine the multimodal medical volumes through a weighted average.
RESULTS: Visual and objective comparisons show that the proposed method is competitive with existing methods on various medical volumes in different imaging modalities.
CONCLUSION: The TSR-based 3-D fusion approach with the weighted average rule can preserve the 3-D structure of a medical volume and reduce low contrast and artifacts in the fused product.
SIGNIFICANCE: The designed weights offer effective weight assignment and accurate salience-level measurement, which improve the performance of the fusion approach.
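Read from the abstract, the "weighted average" rule amounts to activity-derived weights on the tensor sparse coefficients; schematically (the 3-D local-to-global weighting scheme is the paper's contribution and is not spelled out here):

```latex
w_m \;=\; \frac{\|\mathcal{A}_m\|_1}{\sum_{k=1}^{M}\|\mathcal{A}_k\|_1},
\qquad
\mathcal{F} \;=\; \sum_{m=1}^{M} w_m\,\mathcal{X}_m,
```

where $\mathcal{X}_m$ are the $M$ source volumes arranged as three-order tensors, $\mathcal{A}_m$ their tensor sparse coefficients, and $\mathcal{F}$ the fused volume.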
15. Log-Gabor energy based multimodal medical image fusion in NSCT domain. Comput Math Methods Med 2014;2014:835481. PMID: 25214889; PMCID: PMC4158263; DOI: 10.1155/2014/835481.
Abstract
Multimodal medical image fusion is a powerful tool in clinical applications such as noninvasive diagnosis, image-guided radiotherapy, and treatment planning. In this paper, a novel method for multimodal medical image fusion based on the nonsubsampled contourlet transform (NSCT) is presented; the NSCT is approximately shift-invariant and can effectively suppress pseudo-Gibbs phenomena. The source medical images are initially transformed by the NSCT, and the low- and high-frequency components are then fused. Phase congruency, which provides a contrast- and brightness-invariant representation, is applied to fuse the low-frequency coefficients, whereas the Log-Gabor energy, which efficiently identifies the frequency coefficients belonging to clear and detailed parts, is employed to fuse the high-frequency coefficients. The proposed fusion method has been compared with fusion methods based on the discrete wavelet transform (DWT), the fast discrete curvelet transform (FDCT), and the dual-tree complex wavelet transform (DTCWT), as well as with other NSCT-based methods. Visual and quantitative experimental results indicate that the proposed method obtains more effective and accurate fusion results for multimodal medical images than the other algorithms. Further, the applicability of the proposed method has been demonstrated in a clinical example involving a woman with a recurrent tumor.
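For reference, the Log-Gabor filter whose energy drives the high-frequency rule has the standard radial frequency response

```latex
G(\omega) \;=\; \exp\!\left(-\,\frac{\bigl(\log(\omega/\omega_0)\bigr)^{2}}
{2\bigl(\log(\sigma/\omega_0)\bigr)^{2}}\right),
```

where $\omega_0$ is the center frequency and the ratio $\sigma/\omega_0$ sets the bandwidth; the Log-Gabor energy of a high-frequency coefficient is then the summed squared filter response across scales and orientations.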