1. Sun J, Chen B, Lu R, Cheng Z, Qu C, Yuan X. Advancing Hyperspectral and Multispectral Image Fusion: An Information-Aware Transformer-Based Unfolding Network. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:7407-7421. PMID: 38776209. DOI: 10.1109/tnnls.2024.3400809.
Abstract
In hyperspectral image (HSI) processing, the fusion of a high-resolution multispectral image (HR-MSI) and a low-resolution HSI (LR-HSI) of the same scene, known as MSI-HSI fusion, is a crucial step in obtaining the desired high-resolution HSI (HR-HSI). Owing to their powerful representation ability, convolutional neural network (CNN)-based deep unfolding methods have demonstrated promising performance. However, the limited receptive field of CNNs often leads to inaccurate long-range spatial features, and the fixed input and output images of each stage in unfolding networks restrict feature transmission, limiting overall performance. To this end, we propose a novel and efficient information-aware transformer-based unfolding network (ITU-Net) to model long-range dependencies and transfer more information across stages. Specifically, we employ a customized transformer block that learns representations from both the spatial and frequency domains while avoiding quadratic complexity with respect to the input length. For spatial feature extraction, we develop an information transfer guided linearized attention (ITLA), which transmits high-throughput information between adjacent stages and extracts contextual features along the spatial dimension in linear complexity. Moreover, we introduce frequency-domain learning in the feedforward network (FFN) to capture token variations of the image and narrow the frequency gap. By integrating the proposed transformer blocks with the unfolding framework, our ITU-Net achieves state-of-the-art (SOTA) performance on both synthetic and real hyperspectral datasets.
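The abstract does not disclose ITLA's exact kernel, but the linear-complexity trick it relies on is standard: replace softmax attention with a positive feature map and reassociate the matrix products so the quadratic attention matrix is never formed. A minimal NumPy sketch (the elu-plus-one feature map is an assumption, not necessarily the paper's choice):

```python
import numpy as np

def linearized_attention(Q, K, V, eps=1e-6):
    """Attention in O(n) for sequence length n: apply a positive feature
    map phi to Q and K, then reassociate phi(Q) @ (phi(K).T @ V) so the
    n x n attention matrix is never materialized."""
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1 > 0
    Qp, Kp = phi(Q), phi(K)              # (n, d)
    kv = Kp.T @ V                        # (d, d_v), independent of n
    z = Qp @ Kp.sum(axis=0)              # (n,) row normalizers
    return (Qp @ kv) / (z[:, None] + eps)

rng = np.random.default_rng(0)
n, d = 128, 16
Q, K, V = rng.normal(size=(3, n, d))
out = linearized_attention(Q, K, V)
print(out.shape)  # (128, 16)
```

Because the feature map is positive, each output row is a convex combination of the value rows, just as in softmax attention, but the cost grows linearly in `n`.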
2. Wu X, Cao ZH, Huang TZ, Deng LJ, Chanussot J, Vivone G. Fully-Connected Transformer for Multi-Source Image Fusion. IEEE Transactions on Pattern Analysis and Machine Intelligence 2025; 47:2071-2088. PMID: 40031431. DOI: 10.1109/tpami.2024.3523364.
Abstract
Multi-source image fusion combines information from multiple images into a single output, thus improving imaging quality, and has aroused great interest in the community. How to integrate information from different sources remains a major challenge, even though existing self-attention-based transformer methods can capture spatial and channel similarities. In this paper, we first discuss the mathematical concepts behind the proposed generalized self-attention mechanism, of which existing self-attentions are basic forms. The proposed mechanism employs multilinear algebra to drive the development of a novel fully-connected self-attention (FCSA) method that fully exploits local and non-local domain-specific correlations among multi-source images. Moreover, we propose a multi-source image representation and embed it into the FCSA framework as a non-local prior within an optimization problem. Several different fusion problems are unfolded into the proposed fully-connected transformer fusion network (FC-Former). More broadly, the concept of generalized self-attention can promote the further development of self-attention, so the FC-Former can be viewed as a network model unifying different fusion tasks. Compared with state-of-the-art methods, the proposed FC-Former exhibits robust and superior performance, showing its capability of faithfully preserving information.
3. Dian R, Liu Y, Li S. Spectral Super-Resolution via Deep Low-Rank Tensor Representation. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:5140-5150. PMID: 38466604. DOI: 10.1109/tnnls.2024.3359852.
Abstract
Spectral super-resolution has attracted increasing attention as a simpler and cheaper way of obtaining hyperspectral images (HSIs). Although many convolutional neural network (CNN)-based approaches have yielded impressive results, most of them ignore the low-rank prior of HSIs, resulting in huge computational and storage costs. In addition, the ability of CNN-based methods to capture global correlations is limited by the receptive field. To surmount these problems, we design a novel low-rank tensor reconstruction network (LTRN) for spectral super-resolution. Specifically, we treat the features of HSIs as 3-D tensors with low-rank properties due to their spectral similarity and spatial sparsity. Then, we combine canonical polyadic (CP) decomposition with neural networks to design an adaptive low-rank prior learning (ALPL) module that enables feature learning in a 1-D space. This module has two core components: the adaptive vector learning (AVL) module and the multidimensionwise multihead self-attention (MMSA) module. The AVL module compresses an HSI into a 1-D space by using a vector to represent its information. The MMSA module improves the ability to capture long-range dependencies along the row, column, and spectral dimensions, respectively. Finally, our LTRN, mainly a cascade of several ALPL modules and feedforward networks (FFNs), achieves high-quality spectral super-resolution with fewer parameters. To test the effectiveness of our method, we conduct experiments on two datasets: the CAVE dataset and the Harvard dataset. Experimental results show that our LTRN not only matches state-of-the-art methods in quality but also has fewer parameters. The code is available at https://github.com/renweidian/LTRN.
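The CP decomposition at the heart of the ALPL module represents a 3-D tensor as a sum of rank-1 outer products, which is where the parameter savings come from. A minimal reconstruction sketch (sizes are illustrative):

```python
import numpy as np

def cp_reconstruct(A, B, C):
    """Rebuild a 3-D tensor from CP factors: the sum over r of the
    outer products A[:, r] x B[:, r] x C[:, r]."""
    return np.einsum('ir,jr,kr->ijk', A, B, C)

rng = np.random.default_rng(0)
I, J, K, R = 8, 9, 10, 3          # rank 3: 81 factor parameters vs 720 entries
A = rng.normal(size=(I, R))
B = rng.normal(size=(J, R))
C = rng.normal(size=(K, R))
X = cp_reconstruct(A, B, C)
print(X.shape)  # (8, 9, 10)
```

A rank-R CP model stores R(I+J+K) numbers instead of IJK, which is the low-rank prior the network exploits.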
4. Li J, Du S, Song R, Li Y, Du Q. Progressive Spatial Information-Guided Deep Aggregation Convolutional Network for Hyperspectral Spectral Super-Resolution. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:1677-1691. PMID: 37889820. DOI: 10.1109/tnnls.2023.3325682.
Abstract
Fusion-based spectral super-resolution aims to yield a high-resolution hyperspectral image (HR-HSI) by integrating the available high-resolution multispectral image (HR-MSI) with the corresponding low-resolution hyperspectral image (LR-HSI). With the prosperity of deep convolutional neural networks, plentiful fusion methods have achieved breakthroughs in reconstruction performance. Nevertheless, due to inadequate and improper use of cross-modality information, most current state-of-the-art (SOTA) fusion-based methods cannot produce very satisfactory recovery quality and only yield the desired results at small upsampling scales, which limits practical applications. In this article, we propose a novel progressive spatial information-guided deep aggregation convolutional neural network (SIGnet) for enhancing the performance of hyperspectral image (HSI) spectral super-resolution (SSR), built from several dense residual channel affinity learning (DRCA) blocks cooperating with a spatial-guided propagation (SGP) module as the backbone. Specifically, the DRCA block consists of an encoding part and a decoding part connected by a channel affinity propagation (CAP) module and several cross-layer skip connections. The CAP module exploits a channel affinity matrix to model correlations among channels of the feature maps, aggregating the channel-wise interdependencies of the middle layers and thereby further boosting reconstruction accuracy. Additionally, to efficiently utilize the cross-modality information of the two inputs, we develop an innovative SGP module equipped with a degradation-simulation part and a deformable adaptive fusion part, which progressively refines the coarse HSI feature maps at the pixel level. Extensive experimental results demonstrate the superiority of our proposed SIGnet over several SOTA fusion-based algorithms.
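The abstract does not specify how the channel affinity matrix is computed; a common choice, shown here purely as an assumption, is the cosine-similarity Gram matrix of the flattened channels, whose rows can then reweight or aggregate channel features:

```python
import numpy as np

def channel_affinity(feat):
    """Channel affinity matrix of a (C, H, W) feature map: flatten the
    spatial dimensions, L2-normalize each channel, and take the Gram
    matrix, so entry (i, j) is the cosine similarity of channels i, j."""
    C = feat.shape[0]
    F = feat.reshape(C, -1)
    F = F / (np.linalg.norm(F, axis=1, keepdims=True) + 1e-8)
    return F @ F.T                       # (C, C)

rng = np.random.default_rng(0)
feat = rng.normal(size=(6, 16, 16))
M = channel_affinity(feat)
print(M.shape)  # (6, 6)
```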
5. Dong W, Yang Y, Qu J, Li Y, Yang Y, Jia X. Feature Pyramid Fusion Network for Hyperspectral Pansharpening. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:1555-1567. PMID: 37889826. DOI: 10.1109/tnnls.2023.3325887.
Abstract
Hyperspectral (HS) pansharpening aims to fuse an observed HS image with a panchromatic (PAN) image to produce an image with the high spectral resolution of the former and the high spatial resolution of the latter. Most existing convolutional neural network (CNN)-based pansharpening methods reconstruct the desired high-resolution image from an encoded low-resolution (LR) representation. However, the encoded LR representation captures the semantic information of the image and is inadequate for reconstructing fine details. How to effectively extract both high-resolution and LR representations for high-resolution image reconstruction is therefore the main objective of this article. We propose a feature pyramid fusion network (FPFNet) for pansharpening, which permits the network to extract multiresolution representations from the PAN and HS images in two branches. The PAN branch starts from a high-resolution stream that maintains the spatial resolution of the PAN image and gradually adds LR streams in parallel. The structure of the HS branch closely mirrors that of the PAN branch, but starts with the LR stream and gradually adds high-resolution streams. The representations at corresponding resolutions of the PAN and HS branches are fused and gradually upsampled in a coarse-to-fine manner to reconstruct the high-resolution HS image. Experimental results on three datasets demonstrate the significant superiority of the proposed FPFNet over state-of-the-art methods in both qualitative and quantitative comparisons.
6. Zhou X, You Z, Sun W, Zhao D, Yan S. Fractional-order stochastic gradient descent method with momentum and energy for deep neural networks. Neural Networks 2025; 181:106810. PMID: 39447432. DOI: 10.1016/j.neunet.2024.106810.
Abstract
In this paper, a novel fractional-order stochastic gradient descent with momentum and energy (FOSGDME) approach is proposed. Specifically, to address the difficulty existing fractional gradient algorithms have in converging to a true extreme point, a novel fractional-order stochastic gradient descent (FOSGD) method is presented by modifying the definition of the Caputo fractional-order derivative. A FOSGD with momentum (FOSGDM) is then established by incorporating momentum information to further accelerate convergence speed and accuracy. In addition, to improve robustness and accuracy, a FOSGD with momentum and energy is established by further introducing an energy formulation. Extensive experimental results on the CIFAR-10 image classification dataset obtained with ResNet and DenseNet demonstrate that the proposed FOSGD, FOSGDM and FOSGDME algorithms are superior to integer-order optimization algorithms and achieve state-of-the-art performance.
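The abstract does not give the modified Caputo definition, so the following is only a hypothetical sketch of a fractional-order SGD-with-momentum step in the spirit of fractional gradient descent; the Gamma-function scaling term and all hyperparameters are assumptions, not the paper's formulation:

```python
import math
import numpy as np

def fosgdm_step(theta, theta_prev, grad, velocity,
                lr=0.1, alpha=0.9, beta=0.9, eps=1e-8):
    """Hypothetical fractional-order SGD step with momentum: the gradient
    is scaled by |theta - theta_prev|^(1 - alpha) / Gamma(2 - alpha), a
    Caputo-derivative-inspired factor; alpha -> 1 recovers plain SGD."""
    frac = np.abs(theta - theta_prev) ** (1.0 - alpha) / math.gamma(2.0 - alpha)
    velocity = beta * velocity + (frac + eps) * grad
    return theta - lr * velocity, velocity

# Toy problem: minimize 0.5 * ||theta||^2, whose gradient is theta itself.
theta = np.array([2.0, -3.0])
theta_prev = theta + 0.1
velocity = np.zeros_like(theta)
for _ in range(200):
    new_theta, velocity = fosgdm_step(theta, theta_prev, theta, velocity)
    theta_prev, theta = theta, new_theta
print(float(np.abs(theta).max()))
```

On this toy quadratic the iterates shrink toward the minimizer at zero; the fractional factor merely rescales the effective step size as the iterates approach a fixed point.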
Affiliation(s)
- Xingwen Zhou
- School of Information Science and Engineering, Lanzhou University, 222 Tianshui South Road, Chengguan District, Lanzhou 730000, Gansu, China; School of Nuclear Science and Technology, Lanzhou University, Lanzhou 730000, Gansu, China
- Zhenghao You
- School of Information Science and Engineering, Lanzhou University, 222 Tianshui South Road, Chengguan District, Lanzhou 730000, Gansu, China
- Weiguo Sun
- School of Information Science and Engineering, Lanzhou University, 222 Tianshui South Road, Chengguan District, Lanzhou 730000, Gansu, China
- Dongdong Zhao
- School of Information Science and Engineering, Lanzhou University, 222 Tianshui South Road, Chengguan District, Lanzhou 730000, Gansu, China
- Shi Yan
- School of Information Science and Engineering, Lanzhou University, 222 Tianshui South Road, Chengguan District, Lanzhou 730000, Gansu, China
7. Zhu Y, Fu X, Zhang Z, Liu A, Xiong Z, Zha ZJ. Hue Guidance Network for Single Image Reflection Removal. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:13701-13712. PMID: 37220051. DOI: 10.1109/tnnls.2023.3270938.
Abstract
Reflection from glass is ubiquitous in daily life but usually undesirable in photographs. To remove these unwanted artifacts, existing methods rely on either correlative auxiliary information or handcrafted priors to constrain this ill-posed problem. However, due to their limited capability to describe the properties of reflections, these methods are unable to handle strong and complex reflection scenes. In this article, we propose a two-branch hue guidance network (HGNet) for single image reflection removal (SIRR) that integrates image information with corresponding hue information. The complementarity between these two kinds of information has not previously been noticed; the key insight is that hue information describes reflections well and can therefore serve as a superior constraint for the specific SIRR task. Accordingly, the first branch extracts salient reflection features by directly estimating the hue map. The second branch leverages these effective features, which help locate salient reflection regions, to obtain a high-quality restored image. Furthermore, we design a new cyclic hue loss to provide a more accurate optimization direction for network training. Experiments substantiate the superiority of our network, especially its excellent generalization to various reflection scenes, compared with state-of-the-art methods both qualitatively and quantitatively. Source code is available at https://github.com/zhuyr97/HGRR.
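The hue map that the first branch estimates can be computed in closed form from RGB; a vectorized sketch of the standard HSV hue (normalized to [0, 1)):

```python
import numpy as np

def hue_map(rgb):
    """Hue channel of an (H, W, 3) RGB image with values in [0, 1],
    computed piecewise by whichever channel attains the maximum."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    cmax = rgb.max(axis=-1)
    delta = cmax - rgb.min(axis=-1)
    h = np.zeros_like(cmax)
    m = delta > 0
    rm = m & (cmax == r)
    gm = m & (cmax == g) & ~rm
    bm = m & (cmax == b) & ~rm & ~gm
    h[rm] = ((g[rm] - b[rm]) / delta[rm]) % 6
    h[gm] = (b[gm] - r[gm]) / delta[gm] + 2
    h[bm] = (r[bm] - g[bm]) / delta[bm] + 4
    return h / 6.0

img = np.zeros((1, 3, 3))
img[0, 0] = [1, 0, 0]   # pure red   -> hue 0
img[0, 1] = [0, 1, 0]   # pure green -> hue 1/3
img[0, 2] = [0, 0, 1]   # pure blue  -> hue 2/3
print(hue_map(img)[0])  # [0.         0.33333333 0.66666667]
```

Because hue depends only on the ratios among R, G and B, it is insensitive to brightness changes, which is one plausible reason it constrains reflections well.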
8. Wang Y, Li W, Liu N, Gui Y, Tao R. FuBay: An Integrated Fusion Framework for Hyperspectral Super-Resolution Based on Bayesian Tensor Ring. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:14712-14726. PMID: 37327099. DOI: 10.1109/tnnls.2023.3281355.
Abstract
Fusion with corresponding finer-resolution images has been a promising way to spatially enhance hyperspectral images (HSIs). Recently, low-rank tensor-based methods have shown advantages over other kinds. However, current methods either resort to blind manual selection of the latent tensor rank, although prior knowledge about tensor rank is surprisingly limited, or impose low-rankness through regularization without exploring the underlying low-dimensional factors; both leave the computational burden of parameter tuning. To address this, we propose a novel Bayesian sparse learning-based tensor ring (TR) fusion model, named FuBay. By specifying a hierarchical sparsity-inducing prior distribution, the proposed method becomes the first fully Bayesian probabilistic tensor framework for hyperspectral fusion. With the relationship between component sparseness and the corresponding hyperprior parameter well studied, a component pruning part is established to asymptotically approach the true latent rank. Furthermore, a variational inference (VI)-based algorithm is derived to learn the posterior of the TR factors, circumventing the nonconvex optimization that troubles most tensor decomposition-based fusion methods. As a Bayesian learning method, our model requires no parameter tuning. Finally, extensive experiments demonstrate its superior performance compared with state-of-the-art methods.
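The tensor ring format underlying FuBay represents each tensor entry as the trace of a product of core slices. A minimal reconstruction sketch for a 3-D tensor (ranks and sizes are illustrative, not the paper's settings):

```python
import numpy as np

def tr_reconstruct(G1, G2, G3):
    """Tensor-ring reconstruction of a 3-D tensor: entry (i, j, k) equals
    Tr(G1[:, i, :] @ G2[:, j, :] @ G3[:, k, :]). Each core Gn has shape
    (r_prev, I_n, r_next), with the ranks closing a ring."""
    return np.einsum('aib,bjc,cka->ijk', G1, G2, G3)

rng = np.random.default_rng(0)
r = 2
G1 = rng.normal(size=(r, 4, r))
G2 = rng.normal(size=(r, 5, r))
G3 = rng.normal(size=(r, 6, r))
X = tr_reconstruct(G1, G2, G3)
print(X.shape)  # (4, 5, 6)
```

Pruning TR components, as FuBay's sparsity-inducing priors do, amounts to shrinking the ring ranks `r` until they match the latent rank of the data.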
9. Zhang Q, Zheng Y, Yuan Q, Song M, Yu H, Xiao Y. Hyperspectral Image Denoising: From Model-Driven, Data-Driven, to Model-Data-Driven. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:13143-13163. PMID: 37279128. DOI: 10.1109/tnnls.2023.3278866.
Abstract
Mixed noise pollution in hyperspectral images (HSIs) severely disturbs subsequent interpretation and applications. In this technical review, we first analyze the noise in different noisy HSIs and summarize crucial points for designing HSI denoising algorithms. Then, a general HSI restoration model is formulated for optimization. Next, we comprehensively review existing HSI denoising methods, from model-driven strategies (nonlocal means, total variation, sparse representation, low-rank matrix approximation, and low-rank tensor factorization), through data-driven strategies (2-D convolutional neural network (CNN), 3-D CNN, hybrid, and unsupervised networks), to model-data-driven strategies. The advantages and disadvantages of each strategy for HSI denoising are summarized and contrasted. Building on this, we present an evaluation of the HSI denoising methods on various noisy HSIs in simulated and real experiments, reporting the classification results on the denoised HSIs and the execution efficiency of each method. Finally, prospects for future HSI denoising methods are listed to guide ongoing research. The HSI denoising dataset can be found at https://qzhang95.github.io.
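For the simulated experiments the review describes, mixed noise can be synthesized on a clean cube; a rough sketch combining per-band Gaussian noise, salt-and-pepper impulses, and column stripes (the specific noise levels and affected bands are arbitrary choices, not the review's protocol):

```python
import numpy as np

def add_mixed_noise(hsi, sigma=0.1, impulse_ratio=0.05, stripe_bands=(3,), seed=0):
    """Simulate mixed HSI noise on a (bands, H, W) cube in [0, 1]:
    Gaussian noise in every band, salt-and-pepper impulses, and
    additive column stripes in selected bands."""
    rng = np.random.default_rng(seed)
    noisy = hsi + rng.normal(0.0, sigma, hsi.shape)            # Gaussian
    mask = rng.random(hsi.shape) < impulse_ratio               # impulses
    noisy[mask] = rng.integers(0, 2, int(mask.sum())).astype(float)
    for b in stripe_bands:                                     # stripes
        noisy[b] += rng.normal(0.0, sigma, hsi.shape[2])[None, :]
    return np.clip(noisy, 0.0, 1.0)

clean = np.full((8, 32, 32), 0.5)
noisy = add_mixed_noise(clean)
print(noisy.shape)  # (8, 32, 32)
```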
10. Yang J, Xiao L, Zhao YQ, Chan JCW. Unsupervised Deep Tensor Network for Hyperspectral-Multispectral Image Fusion. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:13017-13031. PMID: 37134042. DOI: 10.1109/tnnls.2023.3266038.
Abstract
Fusing low-resolution (LR) hyperspectral images (HSIs) with high-resolution (HR) multispectral images (MSIs) is a significant technology for enhancing the resolution of HSIs. Despite the encouraging results from deep learning (DL) in HSI-MSI fusion, some issues remain. First, an HSI is a multidimensional signal, and the ability of current DL networks to represent multidimensional features has not been thoroughly investigated. Second, most DL HSI-MSI fusion networks need HR HSI ground truth for training, which is often unavailable in practice. In this study, we integrate tensor theory with DL and propose an unsupervised deep tensor network (UDTN) for HSI-MSI fusion. We first propose a tensor filtering layer prototype and build on it a coupled tensor filtering module, which jointly represents the LR HSI and HR MSI as several features revealing the principal components of the spectral and spatial modes, together with a sharing code tensor describing the interaction among the different modes. Specifically, the features of the different modes are represented by the learnable filters of the tensor filtering layers, while the sharing code tensor is learned by a projection module in which a co-attention mechanism encodes the LR HSI and HR MSI and projects them onto the sharing code tensor. The coupled tensor filtering and projection modules are jointly trained from the LR HSI and HR MSI in an unsupervised, end-to-end way. The latent HR HSI is then inferred from the sharing code tensor, the spatial-mode features of the HR MSI, and the spectral-mode features of the LR HSI. Experiments on simulated and real remote-sensing datasets demonstrate the effectiveness of the proposed method.
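Tensor filtering operates separately on the modes of a multidimensional signal; the basic building block behind such mode-wise representations is the mode-n product, sketched here:

```python
import numpy as np

def mode_n_product(X, U, n):
    """Mode-n product of tensor X with matrix U: contract U's columns
    with mode n of X, replacing dimension I_n by U.shape[0]."""
    return np.moveaxis(np.tensordot(U, X, axes=(1, n)), 0, n)

X = np.arange(24.0).reshape(2, 3, 4)
U = np.ones((5, 3))                    # acts on mode 1 (size 3 -> 5)
Y = mode_n_product(X, U, 1)
print(Y.shape)  # (2, 5, 4)
```

With an all-ones matrix, each output fiber along mode 1 is just the sum over that mode, which makes the contraction easy to verify by hand.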
11. Wu C, Li J, Song R, Li Y, Du Q. HPRN: Holistic Prior-Embedded Relation Network for Spectral Super-Resolution. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:11409-11423. PMID: 37030818. DOI: 10.1109/tnnls.2023.3260828.
Abstract
Spectral super-resolution (SSR) refers to recovering a hyperspectral image (HSI) from its RGB counterpart. Because of the one-to-many nature of the SSR problem, a single RGB image can be reprojected to many HSIs. The key to tackling this ill-posed problem is to incorporate multisource prior information, such as the natural spatial context prior of RGB images, deep feature priors, or the inherent statistical priors of HSIs, so as to effectively alleviate the degree of ill-posedness. However, most current approaches consider only general and limited priors in their customized convolutional neural networks (CNNs), and thus cannot guarantee the confidence and fidelity of the reconstructed spectra. In this article, we propose a novel holistic prior-embedded relation network (HPRN) that integrates comprehensive priors to regularize and optimize the solution space of SSR. The core framework is assembled from several multiresidual relation blocks (MRBs) that fully facilitate the transmission and utilization of the low-frequency content prior of RGB images. The semantic prior of the RGB input is introduced to mark category attributes, and a semantic-driven spatial relation module (SSRM) performs feature aggregation over clustered similar ranges to refine the recovered characteristics. In addition, we develop a transformer-based channel relation module (TCRM), which abandons the previous practice of using scalars as descriptors of channel-wise relations in deep feature priors and replaces them with vectors, making the mapping function more robust and smoother. To maintain the mathematical correlation and spectral consistency between hyperspectral bands, second-order prior constraints (SOPCs) are incorporated into the loss function to guide the HSI reconstruction. Finally, extensive experimental results on four benchmarks demonstrate that our HPRN reaches state-of-the-art performance for SSR both quantitatively and qualitatively. Furthermore, the effectiveness and usefulness of the reconstructed spectra are verified by classification results on a remote sensing dataset. Code is available at https://github.com/Deep-imagelab/HPRN.
12. Dian R, Shan T, He W, Liu H. Spectral Super-Resolution via Model-Guided Cross-Fusion Network. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:10059-10070. PMID: 37022225. DOI: 10.1109/tnnls.2023.3238506.
Abstract
Spectral super-resolution, which reconstructs a hyperspectral image (HSI) from a single red-green-blue (RGB) image, has attracted increasing attention. Recently, convolutional neural networks (CNNs) have achieved promising performance, yet they often fail to simultaneously exploit the imaging model of spectral super-resolution and the complex spatial and spectral characteristics of the HSI. To tackle these problems, we build a novel cross-fusion (CF)-based model-guided network (called SSRNet) for spectral super-resolution. Specifically, based on the imaging model, we unfold the spectral super-resolution problem into an HSI prior learning (HPL) module and an imaging model guiding (IMG) module. Instead of modeling just one kind of image prior, the HPL module is composed of two subnetworks with different structures, which effectively learn the complex spatial and spectral priors of the HSI, respectively. A CF strategy establishes the connection between the two subnetworks, further improving the learning performance of the CNN. The IMG module solves a strongly convex optimization problem, adaptively optimizing and merging the two features learned by the HPL module by exploiting the imaging model. The two modules are alternately connected to achieve optimal HSI reconstruction performance. Experiments on both simulated and real data demonstrate that the proposed method achieves superior spectral reconstruction results with a relatively small model size. The code will be available at https://github.com/renweidian.
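The imaging model that the IMG module exploits maps the HSI to its RGB observation through the camera's spectral response functions. A minimal sketch (the flat response functions here are an idealized assumption, not a real camera's curves):

```python
import numpy as np

def spectral_degrade(hsi, srf):
    """Imaging model behind spectral super-resolution: each RGB band is
    the hyperspectral cube integrated against one spectral response.
    hsi: (B, H, W); srf: (3, B), each row summing to 1."""
    return np.einsum('cb,bhw->chw', srf, hsi)

B, H, W = 31, 8, 8
hsi = np.full((B, H, W), 0.5)
srf = np.ones((3, B)) / B          # idealized flat responses (assumption)
rgb = spectral_degrade(hsi, srf)
print(rgb.shape, float(rgb[0, 0, 0]))  # (3, 8, 8) 0.5
```

SSR inverts this many-to-one map, which is exactly why the problem is ill-posed and needs the learned priors.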
13. Ye F, Wu Z, Jia X, Chanussot J, Xu Y, Wei Z. Bayesian Nonlocal Patch Tensor Factorization for Hyperspectral Image Super-Resolution. IEEE Transactions on Image Processing 2023; 32:5877-5892. PMID: 37889806. DOI: 10.1109/tip.2023.3326687.
Abstract
The synthesis of a high-resolution (HR) hyperspectral image (HSI) by fusing a low-resolution HSI with a corresponding HR multispectral image has emerged as a prevalent HSI super-resolution (HSR) scheme. Recent research has revealed that tensor analysis is an emerging tool for HSR. However, most off-the-shelf tensor-based HSR algorithms encounter challenges in rank determination and modeling capacity. To address these issues, we construct nonlocal patch tensors (NPTs) and characterize their low-rank structures with coupled Bayesian tensor factorization. Notably, the intrinsic global spectral correlation and nonlocal spatial similarity can be explored simultaneously under the proposed model. Moreover, benefiting from automatic relevance determination, we propose a hierarchical probabilistic framework based on canonical polyadic (CP) factorization, which incorporates a sparsity-inducing prior over the underlying factor matrices. We further develop an effective expectation-maximization-type optimization scheme for framework estimation. In contrast to existing works, the proposed model can infer the latent CP rank of the NPT adaptively without parameter tuning. Extensive experiments on synthesized and real datasets illustrate the intrinsic capability of our model in rank determination as well as its superiority in fusion performance.
14. Dian R, Guo A, Li S. Zero-Shot Hyperspectral Sharpening. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023; 45:12650-12666. PMID: 37235456. DOI: 10.1109/tpami.2023.3279050.
Abstract
Fusing hyperspectral images (HSIs) with multispectral images (MSIs) of higher spatial resolution has become an effective way to sharpen HSIs. Recently, deep convolutional neural networks (CNNs) have achieved promising fusion performance. However, these methods often suffer from a lack of training data and limited generalization ability. To address these problems, we present a zero-shot learning (ZSL) method for HSI sharpening. Specifically, we first propose a novel method to quantitatively estimate the spectral and spatial responses of imaging sensors with high accuracy. In the training procedure, we spatially subsample the MSI and HSI based on the estimated spatial response and use the downsampled HSI and MSI to infer the original HSI. In this way, we not only exploit the information inherent in the HSI and MSI, but the trained CNN also generalizes well to the test data. In addition, we apply dimensionality reduction to the HSI, which reduces the model size and storage usage without sacrificing fusion accuracy. Furthermore, we design an imaging-model-based loss function for the CNN, which further boosts fusion performance. The experimental results show the significantly high efficiency and accuracy of our approach.
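The zero-shot training scheme can be illustrated by generating training pairs from the two observed images alone; average pooling stands in here for blurring and decimating with the estimated spatial response, which is an assumption for the sake of a runnable sketch:

```python
import numpy as np

def downsample(img, factor):
    """Average-pool spatial downsampling of a (bands, H, W) cube: a
    simple stand-in for the estimated spatial response."""
    b, h, w = img.shape
    return img.reshape(b, h // factor, factor, w // factor, factor).mean(axis=(2, 4))

def make_training_pair(hsi_lr, msi_hr, factor=4):
    """Zero-shot scheme: subsample both observed inputs so the original
    LR HSI can serve as the training target."""
    return downsample(hsi_lr, factor), downsample(msi_hr, factor), hsi_lr

hsi_lr = np.random.rand(31, 16, 16)     # observed LR HSI
msi_hr = np.random.rand(3, 64, 64)      # observed HR MSI (4x resolution)
x, y, target = make_training_pair(hsi_lr, msi_hr)
print(x.shape, y.shape, target.shape)   # (31, 4, 4) (3, 16, 16) (31, 16, 16)
```

A network trained to map (x, y) to `target` can then be applied at the original scale, mapping (hsi_lr, msi_hr) to the latent HR HSI.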
15. Qu Q, Pan B, Xu X, Li T, Shi Z. Unmixing Guided Unsupervised Network for RGB Spectral Super-Resolution. IEEE Transactions on Image Processing 2023; 32:4856-4867. PMID: 37527312. DOI: 10.1109/tip.2023.3299197.
Abstract
Spectral super-resolution, which aims to generate hyperspectral images from RGB images, has attracted research attention recently. However, most existing spectral super-resolution algorithms work in a supervised manner, requiring paired data for training, which are difficult to obtain. In this paper, we propose an Unmixing Guided Unsupervised Network (UnGUN), which does not require paired imagery to achieve unsupervised spectral super-resolution. In addition, UnGUN utilizes arbitrary other hyperspectral imagery as a guidance image to guide the reconstruction of spectral information. UnGUN mainly comprises three branches: two unmixing branches and a reconstruction branch. The hyperspectral and RGB unmixing branches decompose the guidance and RGB images into corresponding endmembers and abundances, from which spectral and spatial priors are extracted. Meanwhile, the reconstruction branch integrates these spectral-spatial priors to generate a coarse hyperspectral image and then refines it. Besides, we design a discriminator to ensure that the distribution of the generated image is close to that of the guidance hyperspectral imagery, so that the reconstructed image follows the characteristics of a real hyperspectral image. The major contribution is an unsupervised framework based on spectral unmixing that realizes spectral super-resolution without paired hyperspectral-RGB images. Experiments demonstrate the superiority of UnGUN compared with some SOTA methods.
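Both unmixing branches rest on the linear mixing model, under which each pixel's spectrum is a convex combination of endmember spectra weighted by abundances; a minimal sketch:

```python
import numpy as np

def linear_mixing(endmembers, abundances):
    """Linear mixing model used in spectral unmixing.
    endmembers: (B, P) spectra for P materials;
    abundances: (P, H, W), nonnegative and summing to 1 per pixel."""
    return np.einsum('bp,phw->bhw', endmembers, abundances)

B, P, H, W = 31, 3, 4, 4
E = np.random.rand(B, P)
A = np.random.rand(P, H, W)
A = A / A.sum(axis=0, keepdims=True)    # enforce the sum-to-one constraint
X = linear_mixing(E, A)
print(X.shape)  # (31, 4, 4)
```

Unmixing is the inverse problem: given X, recover E and A; UnGUN uses the factors recovered from the guidance and RGB images as its spectral and spatial priors.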
16. Ran R, Deng LJ, Jiang TX, Hu JF, Chanussot J, Vivone G. GuidedNet: A General CNN Fusion Framework via High-Resolution Guidance for Hyperspectral Image Super-Resolution. IEEE Transactions on Cybernetics 2023; 53:4148-4161. PMID: 37022388. DOI: 10.1109/tcyb.2023.3238200.
Abstract
Hyperspectral image super-resolution (HISR) aims to fuse a low-resolution hyperspectral image (LR-HSI) and a high-resolution multispectral image (HR-MSI) to generate a high-resolution hyperspectral image (HR-HSI). Recently, convolutional neural network (CNN)-based techniques have been extensively investigated for HISR, yielding competitive outcomes. However, existing CNN-based methods often require a huge number of network parameters, leading to a heavy computational burden and thus limiting the generalization ability. In this article, we fully consider the characteristics of HISR and propose a general CNN fusion framework with high-resolution guidance, called GuidedNet. This framework consists of two branches: 1) the high-resolution guidance branch (HGB), which decomposes the high-resolution guidance image into several scales, and 2) the feature reconstruction branch (FRB), which takes the low-resolution image and the multiscale high-resolution guidance images from the HGB to reconstruct the high-resolution fused image. GuidedNet effectively predicts the high-resolution residual details that are added to the upsampled HSI to simultaneously improve spatial quality and preserve spectral information. The proposed framework is implemented using recursive and progressive strategies, which promote high performance with a significant reduction in network parameters, while ensuring network stability by supervising several intermediate outputs. Additionally, the proposed approach is also suitable for other resolution enhancement tasks, such as remote sensing pansharpening and single-image super-resolution (SISR). Extensive experiments on simulated and real datasets demonstrate that the proposed framework generates state-of-the-art outcomes for several applications (i.e., HISR, pansharpening, and SISR).
Finally, an ablation study and further discussion, assessing, for example, network generalization, the low computational cost, and the reduced number of network parameters, are provided. The code is available at: https://github.com/Evangelion09/GuidedNet.
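The core ideas the abstract describes, multiscale decomposition of the guidance image and residual-detail injection into the upsampled HSI, can be sketched as follows. This is a minimal numpy stand-in, not the paper's learned network: the pyramid uses plain 2x average pooling where GuidedNet's HGB is learned, and the residual would come from the FRB.

```python
import numpy as np

def build_guidance_pyramid(guidance, levels=3):
    """Decompose the HR guidance image into several scales by 2x average
    pooling (a hand-crafted stand-in for the learned HGB)."""
    pyramid = [guidance]
    for _ in range(levels - 1):
        g = pyramid[-1]
        h, w = g.shape[0] // 2, g.shape[1] // 2
        pyramid.append(g[:2 * h, :2 * w].reshape(h, 2, w, 2).mean(axis=(1, 3)))
    return pyramid

def fuse_with_residual(lr_hsi_up, predicted_residual):
    """GuidedNet's key idea: add predicted HR residual details
    to the upsampled HSI."""
    return lr_hsi_up + predicted_residual
```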
|
17
|
Wu J, Xia Y, Wang X, Wei Y, Liu A, Innanje A, Zheng M, Chen L, Shi J, Wang L, Zhan Y, Zhou XS, Xue Z, Shi F, Shen D. uRP: An integrated research platform for one-stop analysis of medical images. FRONTIERS IN RADIOLOGY 2023; 3:1153784. [PMID: 37492386 PMCID: PMC10365282 DOI: 10.3389/fradi.2023.1153784] [Citation(s) in RCA: 52] [Impact Index Per Article: 26.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 01/30/2023] [Accepted: 03/31/2023] [Indexed: 07/27/2023]
Abstract
Introduction: Medical image analysis is of tremendous importance in serving clinical diagnosis, treatment planning, and prognosis assessment. However, the image analysis process usually involves multiple modality-specific software packages and relies on rigorous manual operations, which is time-consuming and offers potentially low reproducibility. Methods: We present an integrated platform, the uAI Research Portal (uRP), to achieve one-stop analysis of multimodal images such as CT, MRI, and PET for clinical research applications. The proposed uRP adopts a modularized architecture to be multifunctional, extensible, and customizable. Results and Discussion: The uRP offers three advantages: 1) it spans a wealth of algorithms for image processing, including semi-automatic delineation, automatic segmentation, registration, classification, quantitative analysis, and image visualization, realizing a one-stop analytic pipeline; 2) it integrates a variety of functional modules that can be directly applied, combined, or customized for specific application domains, such as brain, pneumonia, and knee joint analyses; and 3) it enables full-stack analysis of one disease, including diagnosis, treatment planning, and prognosis assessment, as well as full-spectrum coverage of multiple disease applications. With the continuous development and inclusion of advanced algorithms, we expect this platform to greatly simplify the clinical scientific research process and promote more and better discoveries.
Affiliation(s)
- Jiaojiao Wu
- Department of Research and Development, Shanghai United Imaging Intelligence Co., Ltd., Shanghai, China
- Yuwei Xia
- Department of Research and Development, Shanghai United Imaging Intelligence Co., Ltd., Shanghai, China
- Xuechun Wang
- Department of Research and Development, Shanghai United Imaging Intelligence Co., Ltd., Shanghai, China
- Ying Wei
- Department of Research and Development, Shanghai United Imaging Intelligence Co., Ltd., Shanghai, China
- Aie Liu
- Department of Research and Development, Shanghai United Imaging Intelligence Co., Ltd., Shanghai, China
- Arun Innanje
- Department of Research and Development, United Imaging Intelligence Co., Ltd., Cambridge, MA, United States
- Meng Zheng
- Department of Research and Development, United Imaging Intelligence Co., Ltd., Cambridge, MA, United States
- Lei Chen
- Department of Research and Development, Shanghai United Imaging Intelligence Co., Ltd., Shanghai, China
- Jing Shi
- Department of Research and Development, Shanghai United Imaging Intelligence Co., Ltd., Shanghai, China
- Liye Wang
- Department of Research and Development, Shanghai United Imaging Intelligence Co., Ltd., Shanghai, China
- Yiqiang Zhan
- Department of Research and Development, Shanghai United Imaging Intelligence Co., Ltd., Shanghai, China
- Xiang Sean Zhou
- Department of Research and Development, Shanghai United Imaging Intelligence Co., Ltd., Shanghai, China
- Zhong Xue
- Department of Research and Development, Shanghai United Imaging Intelligence Co., Ltd., Shanghai, China
- Feng Shi
- Department of Research and Development, Shanghai United Imaging Intelligence Co., Ltd., Shanghai, China
- Dinggang Shen
- Department of Research and Development, Shanghai United Imaging Intelligence Co., Ltd., Shanghai, China
- School of Biomedical Engineering, ShanghaiTech University, Shanghai, China
- Shanghai Clinical Research and Trial Center, Shanghai, China
|
18
|
HMFT: Hyperspectral and Multispectral Image Fusion Super-Resolution Method Based on Efficient Transformer and Spatial-Spectral Attention Mechanism. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2023; 2023:4725986. [PMID: 36909978 PMCID: PMC9995205 DOI: 10.1155/2023/4725986] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/25/2022] [Revised: 02/12/2023] [Accepted: 02/14/2023] [Indexed: 03/05/2023]
Abstract
Due to the imaging mechanism of hyperspectral images, the spatial resolution of the resulting images is low. An effective way to solve this problem is to fuse the low-resolution hyperspectral image (LR-HSI) with the high-resolution multispectral image (HR-MSI) to generate a high-resolution hyperspectral image (HR-HSI). Currently, state-of-the-art fusion approaches are based on convolutional neural networks (CNNs), and few have attempted to use Transformers, which show impressive performance on advanced vision tasks. In this paper, a simple and efficient Transformer-based hybrid architecture network is proposed to solve the hyperspectral image fusion super-resolution problem. We combine convolution and Transformer blocks in the backbone network to fully extract spatial-spectral information, taking advantage of the local and global modeling strengths of each. To emphasize features conducive to HR-HSI reconstruction, such as high-frequency information, and to explore the correlation between spectra, a convolutional attention mechanism further refines the extracted features along the spatial and spectral dimensions, respectively. In addition, considering that the resolution of HSIs is usually large, we use a feature split module (FSM) to replace the self-attention computation of the native Transformer, reducing the computational complexity and storage footprint of the model and greatly improving training efficiency. Extensive experiments show that the proposed network architecture achieves the best qualitative and quantitative performance compared with the latest HSI super-resolution methods.
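The complexity reduction from a feature split module can be illustrated with a minimal numpy sketch: attention is computed independently inside each token split, dropping the cost from O(N²) to roughly O(N²/num_splits). The abstract does not give the FSM's exact split rule, so the contiguous-chunk splitting below is an assumption for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def split_attention(x, num_splits=4):
    """Self-attention restricted to token splits (hypothetical FSM stand-in).
    x: (N, d) token matrix; N must be divisible by num_splits."""
    n, d = x.shape
    assert n % num_splits == 0
    out = np.empty_like(x)
    step = n // num_splits
    for i in range(0, n, step):
        chunk = x[i:i + step]                       # (step, d)
        attn = softmax(chunk @ chunk.T / np.sqrt(d))  # (step, step) only
        out[i:i + step] = attn @ chunk
    return out
```

With `num_splits=1` this reduces to ordinary (quadratic) self-attention over all tokens.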
|
19
|
Li Q, Yuan Y, Jia X, Wang Q. Dual-Stage Approach Toward Hyperspectral Image Super-Resolution. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2022; 31:7252-7263. [PMID: 36378792 DOI: 10.1109/tip.2022.3221287] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/16/2023]
Abstract
Hyperspectral imaging produces high spectral resolution at the sacrifice of spatial resolution. Improving the resolution in the spatial domain without reducing the spectral resolution is a very challenging problem. Motivated by the discovery that hyperspectral images exhibit high similarity between adjacent bands over a large spectral range, in this paper we explore a new structure for hyperspectral image super-resolution (DualSR) with a dual-stage design, i.e., a coarse stage and a fine stage. In the coarse stage, five bands with high similarity in a certain spectral range are divided into three groups, and these groups guide the current band to learn latent inter-band knowledge. Under an alternating spectral fusion mechanism, the coarse SR image is super-resolved band by band. To build the model from a global perspective, an enhanced back-projection method with a spectral angle constraint is developed in the fine stage to enforce spatial-spectral consistency, dramatically improving the performance gain. Extensive experiments demonstrate the effectiveness of the proposed coarse and fine stages. Moreover, our network produces state-of-the-art results against existing works in terms of spatial reconstruction and spectral fidelity. Our code is publicly available at https://github.com/qianngli/DualSR.
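The back-projection refinement used in the fine stage has a compact classical form: the super-resolved estimate is repeatedly corrected by upsampling the low-resolution reconstruction error. The numpy sketch below shows that bare loop, assuming average-pool downsampling and nearest-neighbor upsampling; DualSR additionally adds a learned spectral-angle constraint, which is omitted here.

```python
import numpy as np

def downsample(x, s=2):
    """Average-pool by factor s (assumed degradation model)."""
    h, w = x.shape[0] // s, x.shape[1] // s
    return x[:h * s, :w * s].reshape(h, s, w, s).mean(axis=(1, 3))

def upsample(x, s=2):
    """Nearest-neighbor upsampling by factor s."""
    return np.repeat(np.repeat(x, s, axis=0), s, axis=1)

def back_projection(sr, lr, iters=3, s=2):
    """Iteratively feed back the LR reconstruction error into the SR estimate."""
    for _ in range(iters):
        sr = sr + upsample(lr - downsample(sr, s), s)
    return sr
```

After convergence the refined image is consistent with the observation: downsampling it reproduces the LR input.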
|
20
|
Spatial and Spectral-Channel Attention Network for Denoising on Hyperspectral Remote Sensing Image. REMOTE SENSING 2022. [DOI: 10.3390/rs14143338] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/27/2023]
Abstract
Hyperspectral images (HSIs) are frequently contaminated by different noises (Gaussian noise, stripe noise, deadline noise, impulse noise) during acquisition, as a result of the observation environment and imaging system limitations, which causes a loss of image information that is difficult to recover. In this paper, we adopt a U-Net architecture built from 3D SSCA blocks for remote sensing HSI denoising, named SSCANet (Spatial and Spectral-Channel Attention Network). Fully considering the spatial-domain and spectral-domain characteristics of remote sensing HSIs, the SSCA block consists of a spatial attention (SA) block and a spectral-channel attention (SCA) block: the SA block extracts spatial information and enhances spatial representation ability, while the SCA block explores the band-wise relationships within HSIs to preserve spectral information. Compared to earlier 2D convolution, 3D convolution has a powerful spectrum-preservation ability, allowing improved extraction of HSI characteristics. Experimental results demonstrate that our method achieves better restoration than other compared approaches, both visually and quantitatively.
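The band-wise reweighting idea behind a spectral-channel attention block can be sketched in a few lines of numpy. This is a squeeze-and-excitation-style stand-in, not the paper's SCA block: the real block learns its gating from data, whereas here the gate is just a sigmoid of each band's global mean.

```python
import numpy as np

def spectral_channel_attention(hsi):
    """Reweight each band by a gate derived from its global statistics.
    hsi: (bands, H, W). Hand-crafted illustration of the SCA idea."""
    squeeze = hsi.mean(axis=(1, 2))               # one descriptor per band
    weights = 1.0 / (1.0 + np.exp(-squeeze))      # sigmoid gate in (0, 1)
    return hsi * weights[:, None, None]           # band-wise rescaling
```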
|
21
|
Abstract
Pansharpening is an important yet challenging remote sensing image processing task, which aims to reconstruct a high-resolution (HR) multispectral (MS) image by fusing an HR panchromatic (PAN) image and a low-resolution (LR) MS image. Though deep learning (DL)-based pansharpening methods have achieved encouraging performance, they fail to fully utilize the deep semantic features and shallow contextual features when fusing an HR-PAN image and an LR-MS image. In this paper, we propose an efficient full-depth feature fusion network (FDFNet) for remote sensing pansharpening. Specifically, we design three distinctive branches: a PAN branch, an MS branch, and a fusion branch. The features extracted from the PAN and MS branches are progressively injected into the fusion branch at every depth to make the information fusion broader and more comprehensive. With this structure, the low-level contextual features and high-level semantic features can be characterized and integrated adequately. Extensive experiments on reduced- and full-resolution datasets acquired from WorldView-3, QuickBird, and GaoFen-2 sensors demonstrate that the proposed FDFNet, with fewer than 100,000 parameters, performs better than other detail-injection-based proposals and several state-of-the-art approaches, both visually and quantitatively.
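The full-depth injection pattern, branch features fed into the fusion branch at every depth rather than only at the end, can be shown with a toy numpy loop. This is a structural sketch only: in FDFNet each depth applies a learned convolutional block, whereas here the per-depth operation is plain additive injection.

```python
import numpy as np

def full_depth_fusion(pan_feats, ms_feats, fuse_init):
    """Inject PAN and MS branch features into the fusion branch at each depth.
    pan_feats, ms_feats: lists of same-shaped arrays, one per depth (toy
    stand-in; FDFNet uses learned blocks between injections)."""
    x = fuse_init
    for p, m in zip(pan_feats, ms_feats):
        x = x + p + m   # injection at this depth
    return x
```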
|
22
|
An Improved Version of the Generalized Laplacian Pyramid Algorithm for Pansharpening. REMOTE SENSING 2021. [DOI: 10.3390/rs13173386] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
The spatial resolution of multispectral data can be synthetically improved by exploiting the spatial content of a companion panchromatic image. This process, named pansharpening, is widely employed by data providers to augment the quality of images made available for many applications. The huge demand calls for efficient fusion algorithms that do not require specific training phases but instead exploit physical considerations to combine the available data. For this reason, classical model-based approaches are still widely used in practice. We developed and assessed a method for improving a widespread approach based on the generalized Laplacian pyramid decomposition, combining two cost-effective upgrades: estimating the detail-extraction filter from the data and using an improved injection scheme based on multilinear regression. The proposed method was compared with several existing efficient pansharpening algorithms, employing the most widely accepted performance evaluation protocols. The capability to achieve optimal results in very different scenarios was demonstrated on data acquired by the IKONOS and WorldView-3 satellites.
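The detail-injection step at the heart of GLP-style pansharpening is simple enough to state directly: each upsampled MS band receives the PAN details (PAN minus its low-pass version) scaled by an injection gain. The numpy sketch below uses a per-band least-squares gain against the low-pass PAN; the paper's scheme is a multilinear regression, so treat this as a simplified single-regressor illustration.

```python
import numpy as np

def glp_pansharpen(ms_up, pan, pan_low):
    """GLP-style detail injection.
    ms_up: (bands, H, W) upsampled MS; pan, pan_low: (H, W) PAN and its
    low-pass version (low-pass filtering itself is assumed done upstream)."""
    detail = pan - pan_low
    out = np.empty_like(ms_up)
    for b in range(ms_up.shape[0]):
        band, ref = ms_up[b].ravel(), pan_low.ravel()
        gain = (band @ ref) / (ref @ ref)   # least-squares injection gain
        out[b] = ms_up[b] + gain * detail
    return out
```

When the PAN carries no extra detail (pan equals pan_low), the MS image passes through unchanged, which is a useful sanity check on any injection scheme.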
|