1. Liu Y, Jiang K, Xie W, Zhang J, Li Y, Fang L. Hyperspectral anomaly detection with self-supervised anomaly prior. Neural Netw 2025; 187:107294. [PMID: 40020355] [DOI: 10.1016/j.neunet.2025.107294]
Abstract
Hyperspectral anomaly detection (HAD) can identify and locate targets without any known information and is widely applied in Earth observation and military fields. The majority of existing HAD methods use the low-rank representation (LRR) model to separate the background and anomaly through mathematical optimization, in which the anomaly is optimized with a handcrafted sparse prior (e.g., the ℓ2,1-norm). However, this may not be ideal, since such priors overlook the spatial structure present in anomalies and make the detection result largely dependent on a manually set sparsity level. To tackle these problems, we redefine the optimization criterion for the anomaly in the LRR model with a self-supervised network called the self-supervised anomaly prior (SAP). This prior is obtained through a pretext task of self-supervised learning, customized to learn the characteristics of hyperspectral anomalies. Specifically, the pretext task is a classification task that distinguishes the original hyperspectral image (HSI) from a pseudo-anomaly HSI, where the pseudo-anomaly is generated from the original HSI and designed as a prism with arbitrary polygon bases and arbitrary spectral bands. In addition, a dual-purified strategy is proposed to provide a more refined background representation with an enriched background dictionary, facilitating the separation of anomalies from complex backgrounds. Extensive experiments on various hyperspectral datasets demonstrate that the proposed SAP offers a more accurate and interpretable solution than other advanced HAD methods.
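The ℓ2,1-norm prior criticized above is typically enforced through its proximal operator, which shrinks whole pixel columns of the anomaly matrix in the LRR decomposition X = DZ + E. A minimal NumPy sketch of that handcrafted prior on toy data (not the paper's SAP network) follows:

```python
import numpy as np

def prox_l21(E, tau):
    """Proximal operator of tau * ||E||_{2,1}: column-wise soft-thresholding.

    In LRR-based HAD, X = D @ Z + E; updating the anomaly part E with this
    operator zeroes out entire pixel columns, which is exactly the
    handcrafted sparsity the SAP paper replaces with a learned prior.
    """
    norms = np.linalg.norm(E, axis=0, keepdims=True)              # per-pixel l2 norms
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return E * scale

# Toy data: columns are pixels (bands x pixels); only columns with enough
# energy survive the shrinkage as anomaly candidates.
X = np.random.randn(50, 200)
E = prox_l21(X, tau=5.0)
print((np.linalg.norm(E, axis=0) > 0).sum(), "candidate anomaly pixels")
```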
Affiliation(s)
- Yidan Liu: College of Electrical and Information Engineering, Hunan University, Changsha, 410082, China
- Kai Jiang: State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an, 710071, China
- Weiying Xie: State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an, 710071, China
- Jiaqing Zhang: State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an, 710071, China
- Yunsong Li: State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an, 710071, China
- Leyuan Fang: College of Electrical and Information Engineering, Hunan University, Changsha, 410082, China
2. Sun J, Chen B, Lu R, Cheng Z, Qu C, Yuan X. Advancing Hyperspectral and Multispectral Image Fusion: An Information-Aware Transformer-Based Unfolding Network. IEEE Trans Neural Netw Learn Syst 2025; 36:7407-7421. [PMID: 38776209] [DOI: 10.1109/tnnls.2024.3400809]
Abstract
In hyperspectral image (HSI) processing, the fusion of a high-resolution multispectral image (HR-MSI) and a low-resolution HSI (LR-HSI) of the same scene, known as MSI-HSI fusion, is a crucial step in obtaining the desired high-resolution HSI (HR-HSI). With their powerful representation ability, convolutional neural network (CNN)-based deep unfolding methods have demonstrated promising performance. However, the limited receptive fields of CNNs often lead to inaccurate long-range spatial features, and the fixed input and output images of each stage in unfolding networks restrict feature transmission, thus limiting the overall performance. To this end, we propose a novel and efficient information-aware transformer-based unfolding network (ITU-Net) to model long-range dependencies and transfer more information across the stages. Specifically, we employ a customized transformer block to learn representations from both the spatial and frequency domains while avoiding quadratic complexity with respect to the input length. For spatial feature extraction, we develop an information transfer guided linearized attention (ITLA), which transmits high-throughput information between adjacent stages and extracts contextual features along the spatial dimension in linear complexity. Moreover, we introduce frequency domain learning in the feedforward network (FFN) to capture token variations of the image and narrow the frequency gap. By integrating our proposed transformer blocks with the unfolding framework, our ITU-Net achieves state-of-the-art (SOTA) performance on both synthetic and real hyperspectral datasets.
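The linear-complexity attention mentioned here follows the general kernelized-attention recipe: replace the softmax with a positive feature map so that the d x d summary K^T V can be shared across all queries. A hedged NumPy sketch of that generic idea (the paper's ITLA additionally transmits information between unfolding stages, which is omitted):

```python
import numpy as np

def linear_attention(Q, K, V, eps=1e-6):
    """Kernelized linear attention: O(N d^2) instead of O(N^2 d).

    Uses the common elu(x) + 1 feature map; associativity lets us form the
    (d, d) summary K^T V once instead of the (N, N) similarity matrix.
    """
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))   # elu(x) + 1 > 0
    Qp, Kp = phi(Q), phi(K)                               # (N, d) features
    KV = Kp.T @ V                                         # (d, d) summary
    Z = Qp @ Kp.sum(axis=0, keepdims=True).T + eps        # (N, 1) normalizer
    return (Qp @ KV) / Z

N, d = 4096, 64
Q, K, V = (np.random.randn(N, d) for _ in range(3))
out = linear_attention(Q, K, V)   # (N, d); the N x N matrix is never formed
```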
3. Thomas JB, Lapray PJ, Le Moan S. Trends in Snapshot Spectral Imaging: Systems, Processing, and Quality. Sensors (Basel) 2025; 25:675. [PMID: 39943313] [PMCID: PMC11820509] [DOI: 10.3390/s25030675]
Abstract
Recent advances in spectral imaging have enabled snapshot acquisition as a means to mitigate the impracticalities of spectral imaging, e.g., expert operators and cumbersome hardware. Snapshot spectral imaging, e.g., with technologies like spectral filter arrays, has also enabled higher temporal resolution at the expense of spatio-spectral resolution, allowing for the observation of temporal events. Designing, realising, and deploying such technologies remains challenging, however, particularly due to the lack of clear, user-meaningful quality criteria across diverse applications, sensor types, and workflows. Key research gaps include optimising raw image processing from snapshot spectral imagers and assessing spectral image and video quality in ways valuable to end-users, manufacturers, and developers. This paper identifies several challenges and current opportunities. It proposes considering them jointly and suggests creating a new unified snapshot spectral imaging paradigm that would combine new systems and standards, new algorithms, new cost functions, and quality indices.
Affiliation(s)
- Jean-Baptiste Thomas: Imagerie et Vision Artificielle (ImViA) Laboratory, Department Informatique, Electronique, Mécanique (IEM), Université de Bourgogne Europe, 21000 Dijon, France; Department of Computer Science, NTNU—Norwegian University of Science and Technology, 2815 Gjøvik, Norway
- Pierre-Jean Lapray: The Institute for Research in Computer Science, Mathematics, Automation and Signal, Université de Haute-Alsace, IRIMAS UR 7499, 68100 Mulhouse, France
- Steven Le Moan: Department of Computer Science, NTNU—Norwegian University of Science and Technology, 2815 Gjøvik, Norway
4. Liu J, Li S, Liu H, Dian R, Wei X. A Lightweight Pixel-Level Unified Image Fusion Network. IEEE Trans Neural Netw Learn Syst 2024; 35:18120-18132. [PMID: 37819819] [DOI: 10.1109/tnnls.2023.3311820]
Abstract
In recent years, deep-learning-based pixel-level unified image fusion methods have received more and more attention due to their practicality and robustness. However, they usually require a complex network to achieve effective fusion, leading to high computational cost. To achieve more efficient and accurate image fusion, a lightweight pixel-level unified image fusion (L-PUIF) network is proposed. Specifically, information refinement and measurement processes are used to extract gradient and intensity information and to enhance the feature extraction capability of the network. In addition, this information is converted into weights that guide the loss function adaptively. Thus, more effective image fusion can be achieved while keeping the network lightweight. Extensive experiments have been conducted on four public image fusion datasets across multimodal fusion, multifocus fusion, and multiexposure fusion. Experimental results show that L-PUIF achieves better fusion efficiency and visual effects compared with state-of-the-art methods. In addition, the practicability of L-PUIF in high-level computer vision tasks, i.e., object detection and image segmentation, has been verified.
5. Liu H, Feng C, Dian R, Li S. SSTF-Unet: Spatial-Spectral Transformer-Based U-Net for High-Resolution Hyperspectral Image Acquisition. IEEE Trans Neural Netw Learn Syst 2024; 35:18222-18236. [PMID: 37738195] [DOI: 10.1109/tnnls.2023.3313202]
Abstract
To obtain a high-resolution hyperspectral image (HR-HSI), fusing a low-resolution hyperspectral image (LR-HSI) and a high-resolution multispectral image (HR-MSI) is a prominent approach. Numerous approaches based on convolutional neural networks (CNNs) have been presented for hyperspectral image (HSI) and multispectral image (MSI) fusion. Nevertheless, these CNN-based methods may ignore the global relevant features from the input image due to the geometric limitations of convolutional kernels. To obtain more accurate fusion results, we provide a spatial-spectral transformer-based U-net (SSTF-Unet). Our SSTF-Unet can capture the association between distant features and explore the intrinsic information of images. More specifically, we use the spatial transformer block (SATB) and spectral transformer block (SETB) to calculate the spatial and spectral self-attention, respectively. Then, SATB and SETB are connected in parallel to form the spatial-spectral fusion block (SSFB). Inspired by the U-net architecture, we build up our SSTF-Unet through stacking several SSFBs for multiscale spatial-spectral feature fusion. Experimental results on public HSI datasets demonstrate that the designed SSTF-Unet achieves better performance than other existing HSI and MSI fusion approaches.
6. Wang J, Zhang M, Li W, Tao R. A Multistage Information Complementary Fusion Network Based on Flexible-Mixup for HSI-X Image Classification. IEEE Trans Neural Netw Learn Syst 2024; 35:17189-17201. [PMID: 37578909] [DOI: 10.1109/tnnls.2023.3300903]
Abstract
Mixup-based data augmentation has been proven to be beneficial for regularizing models during training, especially in the remote-sensing field, where training data are scarce. However, in the augmentation process, Mixup-based methods ignore the target proportions in different inputs and keep the linear insertion ratio consistent, which can cause a response in the label space even when no effective objects are introduced into the mixed image, due to the randomness of the augmentation process. Moreover, although some previous works have attempted to utilize different multimodal interaction strategies, they cannot be easily extended to various remote-sensing data combinations. To this end, a multistage information complementary fusion network based on flexible-mixup (Flex-MCFNet) is proposed for hyperspectral-X image classification. First, to bridge the gap between the mixed image and the label, a flexible-mixup (FlexMix) data augmentation strategy is designed, where the weight of the label increases with the ratio of the input image to prevent a negative impact on the label space from the introduction of invalid information. More importantly, to accommodate diverse remote-sensing data inputs, including various modal supplements and uncertainties, a multistage information complementary fusion network (MCFNet) is developed. After extracting the features of hyperspectral and complementary modalities [X-modal, including multispectral, synthetic aperture radar (SAR), and light detection and ranging (LiDAR)] separately, the information between complementary modalities is fully exchanged and enhanced through multiple stages of information complementing and fusion, which is used for the final image classification. Extensive experimental results have demonstrated that Flex-MCFNet not only effectively expands the training data, but also adequately regularizes different data combinations to achieve state-of-the-art performance.
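For context, standard mixup draws a single coefficient from a Beta distribution and applies the same linear ratio to inputs and labels; FlexMix's criticism targets exactly this fixed label coefficient. A minimal sketch of that baseline (not the FlexMix weighting itself):

```python
import numpy as np

rng = np.random.default_rng(0)

def mixup(x1, y1, x2, y2, alpha=0.2):
    """Standard mixup baseline: one lam ~ Beta(alpha, alpha) mixes inputs
    and one-hot labels with the same fixed ratio. The label responds with
    weight (1 - lam) even if x2 adds no effective object to the mix,
    which is the failure mode FlexMix is designed to prevent."""
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

x, y = mixup(np.random.rand(8, 8, 10), np.array([1.0, 0.0]),
             np.random.rand(8, 8, 10), np.array([0.0, 1.0]))
```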
7. Li J, Ma J, Omisore OM, Liu Y, Tang H, Ao P, Yan Y, Wang L, Nie Z. Noninvasive Blood Glucose Monitoring Using Spatiotemporal ECG and PPG Feature Fusion and Weight-Based Choquet Integral Multimodel Approach. IEEE Trans Neural Netw Learn Syst 2024; 35:14491-14505. [PMID: 37289613] [DOI: 10.1109/tnnls.2023.3279383]
Abstract
Change of blood glucose (BG) level stimulates the autonomic nervous system, leading to variation in both the human electrocardiogram (ECG) and photoplethysmogram (PPG). In this article, we aimed to construct a novel multimodal framework based on ECG and PPG signal fusion to establish a universal BG monitoring model. This is proposed as a spatiotemporal decision fusion strategy that uses a weight-based Choquet integral for BG monitoring. Specifically, the multimodal framework performs three-level fusion. First, ECG and PPG signals are collected and coupled into different pools. Second, the temporal statistical features and spatial morphological features of the ECG and PPG signals are extracted through numerical analysis and residual networks, respectively. Furthermore, suitable temporal statistical features are determined with three feature selection techniques, and the spatial morphological features are compressed by deep neural networks (DNNs). Lastly, weight-based Choquet integral multimodel fusion is used to couple different BG monitoring algorithms based on the temporal statistical features and spatial morphological features. To verify the feasibility of the model, a total of 103 days of ECG and PPG signals were collected from 21 participants for this study. The BG levels of the participants ranged between 2.2 and 21.8 mmol/L. The results show that the proposed model has excellent BG monitoring performance, with a root-mean-square error (RMSE) of 1.49 mmol/L, a mean absolute relative difference (MARD) of 13.42%, and Zone A + B of 99.49% in tenfold cross-validation. Therefore, we conclude that the proposed fusion approach for BG monitoring has potential in practical applications of diabetes management.
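The decision-fusion step rests on the discrete Choquet integral, which aggregates model outputs with respect to a fuzzy measure defined on coalitions of models. A small sketch with a hypothetical two-model measure (the paper's weight-based measure construction is not reproduced here):

```python
import numpy as np

def choquet_integral(scores, mu):
    """Discrete Choquet integral of n model scores w.r.t. a fuzzy measure.

    mu maps frozensets of model indices to [0, 1] and should be monotone
    with mu(all models) = 1; coalitions are formed by sorting the scores.
    """
    order = np.argsort(scores)                 # ascending scores
    total, prev = 0.0, 0.0
    for i, idx in enumerate(order):
        coalition = frozenset(order[i:])       # models scoring >= current
        total += (scores[idx] - prev) * mu[coalition]
        prev = scores[idx]
    return total

# Hypothetical measure over two BG models, favoring model 1.
mu = {frozenset({0, 1}): 1.0, frozenset({0}): 0.3, frozenset({1}): 0.8}
print(choquet_integral(np.array([5.1, 6.0]), mu))   # 5.1*1.0 + 0.9*0.8
```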
8. Zhao C, Yang P, Zhou F, Yue G, Wang S, Wu H, Chen G, Wang T, Lei B. MHW-GAN: Multidiscriminator Hierarchical Wavelet Generative Adversarial Network for Multimodal Image Fusion. IEEE Trans Neural Netw Learn Syst 2024; 35:13713-13727. [PMID: 37432812] [DOI: 10.1109/tnnls.2023.3271059]
Abstract
Image fusion technology aims to obtain a comprehensive image containing a specific target or detailed information by fusing data of different modalities. However, many deep learning-based algorithms handle edge texture information only through loss functions instead of specifically constructing network modules. The influence of middle-layer features is ignored, which leads to the loss of detailed information between layers. In this article, we propose a multidiscriminator hierarchical wavelet generative adversarial network (MHW-GAN) for multimodal image fusion. First, we construct a hierarchical wavelet fusion (HWF) module as the generator of MHW-GAN to fuse feature information at different levels and scales, which avoids information loss in the middle layers of different modalities. Second, we design an edge perception module (EPM) to integrate edge information from different modalities and avoid the loss of edge information. Third, we leverage the adversarial learning relationship between the generator and three discriminators to constrain the generation of fusion images. The generator aims to generate a fusion image that fools the three discriminators, while the three discriminators aim to distinguish the fusion image and the edge-fusion image from the two source images and the joint edge image, respectively. The final fusion image contains both intensity information and structure information via adversarial learning. Experiments on four types of public and self-collected multimodal image datasets show that the proposed algorithm is superior to previous algorithms in terms of both subjective and objective evaluation.
9. Zhang Q, Zheng Y, Yuan Q, Song M, Yu H, Xiao Y. Hyperspectral Image Denoising: From Model-Driven, Data-Driven, to Model-Data-Driven. IEEE Trans Neural Netw Learn Syst 2024; 35:13143-13163. [PMID: 37279128] [DOI: 10.1109/tnnls.2023.3278866]
Abstract
Mixed noise pollution in HSIs severely disturbs subsequent interpretations and applications. In this technical review, we first analyze the noise in different noisy HSIs and summarize crucial points for designing HSI denoising algorithms. Then, a general HSI restoration model is formulated for optimization. Next, we comprehensively review existing HSI denoising methods, from model-driven strategies (nonlocal means, total variation, sparse representation, low-rank matrix approximation, and low-rank tensor factorization) and data-driven strategies [2-D convolutional neural network (CNN), 3-D CNN, hybrid, and unsupervised networks] to model-data-driven strategies. The advantages and disadvantages of each strategy for HSI denoising are summarized and contrasted. We then present an evaluation of these HSI denoising methods on various noisy HSIs in simulated and real experiments, reporting the classification results on the denoised HSIs and the execution efficiency of each method. Finally, prospects for future HSI denoising methods are listed in this technical review to guide ongoing research on HSI denoising. The HSI denoising dataset can be found at https://qzhang95.github.io.
10. Xing C, Zhao J, Wang Z, Wang M. Deep Ring-Block-Wise Network for Hyperspectral Image Classification. IEEE Trans Neural Netw Learn Syst 2024; 35:14125-14137. [PMID: 37220048] [DOI: 10.1109/tnnls.2023.3274745]
Abstract
Deep learning has achieved many successes in the field of hyperspectral image (HSI) classification. Most existing deep learning-based methods take no account of feature distribution, which may yield features with low separability and discriminability. From the perspective of spatial geometry, an excellent feature distribution should satisfy two properties: block and ring. Block means that, in a feature space, the distance between intraclass samples is small and the distance between interclass samples is large. Ring means that all class samples are distributed overall in a ring topology. Accordingly, in this article, we propose a novel deep ring-block-wise network (DRN) for HSI classification, which takes full account of feature distribution. To obtain a distribution conducive to high classification performance, a ring-block perception (RBP) layer is built into the DRN by integrating self-representation and ring loss into a perception model. In this way, the exported features are driven to follow the requirements of both block and ring, so that they are more separably and discriminatively distributed than with traditional deep networks. Besides, we also design an optimization strategy with alternating updates to obtain the solution of this RBP layer model. Extensive results on the Salinas, Pavia Centre, Indian Pines, and Houston datasets have demonstrated that the proposed DRN achieves better classification performance than state-of-the-art approaches.
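The ring property can be encouraged with the standard ring-loss formulation, which pulls the ℓ2 norm of every feature toward a learnable radius R. A PyTorch sketch of that loss alone (the paper's RBP layer additionally integrates self-representation, omitted here):

```python
import torch

def ring_loss(features, R):
    """Ring loss: penalize the squared gap between each feature's l2 norm
    and a learnable radius R, pushing all classes onto a common ring."""
    norms = features.norm(p=2, dim=1)
    return ((norms - R) ** 2).mean()

feats = torch.randn(32, 128, requires_grad=True)   # a batch of features
R = torch.nn.Parameter(torch.tensor(1.0))          # learnable radius
loss = ring_loss(feats, R)
loss.backward()                                    # gradients for feats and R
```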
11. Yang J, Xiao L, Zhao YQ, Chan JCW. Unsupervised Deep Tensor Network for Hyperspectral-Multispectral Image Fusion. IEEE Trans Neural Netw Learn Syst 2024; 35:13017-13031. [PMID: 37134042] [DOI: 10.1109/tnnls.2023.3266038]
Abstract
Fusing low-resolution (LR) hyperspectral images (HSIs) with high-resolution (HR) multispectral images (MSIs) is a significant technology to enhance the resolution of HSIs. Despite the encouraging results from deep learning (DL) in HSI-MSI fusion, there are still some issues. First, the HSI is a multidimensional signal, and the representability of current DL networks for multidimensional features has not been thoroughly investigated. Second, most DL HSI-MSI fusion networks need HR HSI ground truth for training, but it is often unavailable in reality. In this study, we integrate tensor theory with DL and propose an unsupervised deep tensor network (UDTN) for HSI-MSI fusion. We first propose a tensor filtering layer prototype and further build a coupled tensor filtering module. It jointly represents the LR HSI and HR MSI as several features revealing the principal components of spectral and spatial modes and a sharing code tensor describing the interaction among different modes. Specifically, the features on different modes are represented by the learnable filters of tensor filtering layers, the sharing code tensor is learned by a projection module, in which a co-attention is proposed to encode the LR HSI and HR MSI and then project them onto the sharing code tensor. The coupled tensor filtering module and projection module are jointly trained from the LR HSI and HR MSI in an unsupervised and end-to-end way. The latent HR HSI is inferred with the sharing code tensor, the features on spatial modes of HR MSIs, and the spectral mode of LR HSIs. Experiments on simulated and real remote-sensing datasets demonstrate the effectiveness of the proposed method.
12. Zhao F, Zhao W, Lu H. Interactive Feature Embedding for Infrared and Visible Image Fusion. IEEE Trans Neural Netw Learn Syst 2024; 35:12810-12822. [PMID: 37040245] [DOI: 10.1109/tnnls.2023.3264911]
Abstract
General deep learning-based methods for infrared and visible image fusion rely on the unsupervised mechanism for vital information retention by utilizing elaborately designed loss functions. However, the unsupervised mechanism depends on a well-designed loss function, which cannot guarantee that all vital information of source images is sufficiently extracted. In this work, we propose a novel interactive feature embedding in a self-supervised learning framework for infrared and visible image fusion, attempting to overcome the issue of vital information degradation. With the help of a self-supervised learning framework, hierarchical representations of source images can be efficiently extracted. In particular, interactive feature embedding models are tactfully designed to build a bridge between self-supervised learning and infrared and visible image fusion learning, achieving vital information retention. Qualitative and quantitative evaluations exhibit that the proposed method performs favorably against state-of-the-art methods.
13. Ma W, Ma M, Jiao L, Liu F, Zhu H, Liu X, Yang S, Hou B. An Adaptive Migration Collaborative Network for Multimodal Image Classification. IEEE Trans Neural Netw Learn Syst 2024; 35:10935-10949. [PMID: 37027624] [DOI: 10.1109/tnnls.2023.3245643]
Abstract
Multispectral (MS) and panchromatic (PAN) images belong to different modalities with specific advantageous properties, so there is a large representation gap between them. Moreover, the features extracted independently by the two branches belong to different feature spaces, which is not conducive to subsequent collaborative classification. At the same time, different layers also have different representation capabilities for objects with large size differences. In order to dynamically and adaptively transfer the dominant attributes, reduce the gap between modalities, find the best shared-layer representation, and fuse features of different representation capabilities, this article proposes an adaptive migration collaborative network (AMC-Net) for multimodal remote-sensing (RS) image classification. First, for the input of the network, we combine principal component analysis (PCA) and the nonsubsampled contourlet transform (NSCT) to migrate the advantageous attributes of the PAN and MS images to each other. This not only improves the quality of the images themselves, but also increases the similarity between the two images, thereby reducing the representation gap between them and the pressure on the subsequent classification network. Second, for the interaction in the feature migration branch, we design a feature progressive migration fusion unit (FPMF-Unit) based on the adaptive cross-stitch unit of correlation coefficient analysis (CCA), which lets the network automatically learn the features that need to be shared and migrated, aiming to find the best shared-layer representation for multifeature learning. We also design an adaptive layer fusion mechanism module (ALFM-Module), which adaptively fuses features of different layers, aiming to clearly model the dependencies among multiple layers for differently sized objects. Finally, for the output of the network, we add a correlation coefficient term to the loss function, which encourages the network to converge toward the global optimum. The experimental results indicate that AMC-Net can achieve competitive performance. The code for the network framework is available at: https://github.com/ru-willow/A-AFM-ResNet.
14. Xu G, He C, Wang H, Zhu H, Ding W. DM-Fusion: Deep Model-Driven Network for Heterogeneous Image Fusion. IEEE Trans Neural Netw Learn Syst 2024; 35:10071-10085. [PMID: 37022081] [DOI: 10.1109/tnnls.2023.3238511]
Abstract
Heterogeneous image fusion (HIF) is an enhancement technique for highlighting discriminative information and textural detail from heterogeneous source images. Although various deep neural network-based HIF methods have been proposed, the most widely used, purely data-driven convolutional neural network design cannot guarantee a theoretically grounded architecture or optimal convergence for the HIF problem. In this article, a deep model-driven neural network is designed for the HIF problem, which adaptively integrates the interpretability of model-based techniques with the generalizability of deep learning-based methods. Unlike a general network architecture treated as a black box, the proposed objective function is tailored into several domain-knowledge network modules, forming the compact and explainable deep model-driven HIF network termed DM-fusion. The proposed network demonstrates the feasibility and effectiveness of three parts: the specific HIF model, an iterative parameter learning scheme, and a data-driven network architecture. Furthermore, a task-driven loss function strategy is proposed to achieve feature enhancement and preservation. Numerous experiments on four fusion tasks and downstream applications illustrate the advancement of DM-fusion compared with state-of-the-art (SOTA) methods in both fusion quality and efficiency. The source code will be available soon.
15. Dian R, Shan T, He W, Liu H. Spectral Super-Resolution via Model-Guided Cross-Fusion Network. IEEE Trans Neural Netw Learn Syst 2024; 35:10059-10070. [PMID: 37022225] [DOI: 10.1109/tnnls.2023.3238506]
Abstract
Spectral super-resolution, which reconstructs a hyperspectral image (HSI) from a single red-green-blue (RGB) image, has attracted increasing attention. Recently, convolutional neural networks (CNNs) have achieved promising performance. However, they often fail to simultaneously exploit the imaging model of spectral super-resolution and the complex spatial and spectral characteristics of the HSI. To tackle these problems, we build a novel cross-fusion (CF)-based model-guided network (called SSRNet) for spectral super-resolution. Specifically, based on the imaging model, we unfold the spectral super-resolution into an HSI prior learning (HPL) module and an imaging model guiding (IMG) module. Instead of modeling just one kind of image prior, the HPL module is composed of two subnetworks with different structures, which can effectively learn the complex spatial and spectral priors of the HSI, respectively. Furthermore, a CF strategy is used to establish the connection between the two subnetworks, which further improves the learning performance of the CNN. The IMG module solves a strongly convex optimization problem, which adaptively optimizes and merges the two features learned by the HPL module by exploiting the imaging model. The two modules are alternately connected to achieve optimal HSI reconstruction performance. Experiments on both simulated and real data demonstrate that the proposed method can achieve superior spectral reconstruction results with a relatively small model size. The code will be available at https://github.com/renweidian.
16. Zheng Z, Zhang S, Song H, Yan Q. Deep clustering using 3D attention convolutional autoencoder for hyperspectral image analysis. Sci Rep 2024; 14:4209. [PMID: 38378840] [PMCID: PMC10879088] [DOI: 10.1038/s41598-024-54547-2]
Abstract
Deep clustering has been widely applied in various fields, including natural image and language processing. However, when applied to hyperspectral image (HSI) processing, it encounters challenges due to the high dimensionality of HSIs and their complex spatial-spectral characteristics. This study introduces a deep clustering model specifically tailored for HSI analysis. To address the high-dimensionality issue, redundant dimensions of the HSI are first eliminated by combining principal component analysis (PCA) with t-distributed stochastic neighbor embedding (t-SNE). The reduced dataset is then input into a three-dimensional attention convolutional autoencoder (3D-ACAE) to extract essential spatial-spectral features. The 3D-ACAE uses a spatial-spectral attention mechanism to enhance the captured features. Finally, these enhanced features pass through an embedding layer to create a compact data representation, which is divided into distinct clusters by a clustering layer. Experimental results on three publicly available datasets validate the superiority of the proposed model for HSI analysis.
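The PCA-plus-t-SNE reduction stage maps directly onto scikit-learn. A toy sketch with assumed cube dimensions and random data standing in for a real HSI:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Flatten an HSI cube (H, W, B) to pixels x bands, compress the spectral
# dimension linearly with PCA, then embed nonlinearly with t-SNE; the
# embedding is what a downstream clustering stage would consume.
H, W, B = 64, 64, 200
hsi = np.random.rand(H, W, B).astype(np.float32)   # stand-in for real data
pixels = hsi.reshape(-1, B)                        # (4096, 200)

reduced = PCA(n_components=30).fit_transform(pixels)
embedded = TSNE(n_components=2, init="pca", perplexity=30).fit_transform(reduced)
print(embedded.shape)                              # (4096, 2)
```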
Affiliation(s)
- Ziyou Zheng: College of Communication and Electronic Engineering, Jishou University, People's South Road, Jishou, 416000, Hunan, China
- Shuzhen Zhang: College of Communication and Electronic Engineering, Jishou University, People's South Road, Jishou, 416000, Hunan, China; Key Laboratory of Visual Perception and Artificial Intelligence, Hunan University, Lushan Road, Changsha, 410000, Hunan, China
- Hailong Song: College of Communication and Electronic Engineering, Jishou University, People's South Road, Jishou, 416000, Hunan, China
- Qi Yan: College of Communication and Electronic Engineering, Jishou University, People's South Road, Jishou, 416000, Hunan, China
17. Zhang Y, Li W, Zhang M, Wang S, Tao R, Du Q. Graph Information Aggregation Cross-Domain Few-Shot Learning for Hyperspectral Image Classification. IEEE Trans Neural Netw Learn Syst 2024; 35:1912-1925. [PMID: 35771788] [DOI: 10.1109/tnnls.2022.3185795]
Abstract
Most domain adaptation (DA) methods in cross-scene hyperspectral image classification focus on cases where source data (SD) and target data (TD) with the same classes are obtained by the same sensor. However, classification performance is significantly reduced when there are new classes in the TD. In addition, domain alignment, one of the main approaches in DA, is usually carried out based on local spatial information, rarely taking into account nonlocal spatial information (nonlocal relationships) with strong correspondence. A graph information aggregation cross-domain few-shot learning (Gia-CFSL) framework is proposed to make up for these shortcomings by combining FSL with domain alignment based on graph information aggregation. SD with all labeled samples and TD with a few labeled samples are used for FSL episodic training. Meanwhile, an intradomain distribution extraction block (IDE-block) and a cross-domain similarity aware block (CSA-block) are designed. The IDE-block characterizes and aggregates the intradomain nonlocal relationships, and the interdomain feature and distribution similarities are captured in the CSA-block. Furthermore, feature-level and distribution-level cross-domain graph alignments are used to mitigate the impact of domain shift on FSL. Experimental results on three public HSI datasets demonstrate the superiority of the proposed method. The codes will be available from the website: https://github.com/YuxiangZhang-BIT/IEEE_TNNLS_Gia-CFSL.
18. Qu J, Dong W, Li Y, Hou S, Du Q. An Interpretable Unsupervised Unrolling Network for Hyperspectral Pansharpening. IEEE Trans Cybern 2023; 53:7943-7956. [PMID: 37027771] [DOI: 10.1109/tcyb.2023.3241165]
Abstract
Existing deep convolutional neural networks (CNNs) have recently achieved great success in pansharpening. However, most deep CNN-based pansharpening models use "black-box" architectures and require supervision, making these methods rely heavily on ground-truth data and lose their interpretability for specific problems during network training. This study proposes a novel interpretable unsupervised end-to-end pansharpening network, called IU2PNet, which explicitly encodes the well-studied pansharpening observation model into an unsupervised unrolling iterative adversarial network. Specifically, we first design a pansharpening model whose iterative process can be computed by the half-quadratic splitting algorithm. Then, the iterative steps are unfolded into a deep interpretable iterative generative dual adversarial network (iGDANet). The generator in iGDANet is interwoven with multiple deep feature pyramid denoising modules and deep interpretable convolutional reconstruction modules. In each iteration, the generator establishes an adversarial game with the spatial and spectral discriminators to update both spectral and spatial information without ground-truth images. Extensive experiments show that, compared with state-of-the-art methods, our proposed IU2PNet exhibits very competitive performance in terms of quantitative evaluation metrics and qualitative visual effects.
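For reference, half-quadratic splitting solves a composite problem min_x f(x) + g(x) by introducing an auxiliary variable and alternating two simpler subproblems; the generic iteration (not the paper's specific pansharpening instantiation) is:

```latex
\begin{aligned}
\min_{x,z}\;& f(x) + g(z) + \tfrac{\mu}{2}\,\lVert x - z \rVert_2^2
  \qquad \text{(auxiliary split } z \approx x\text{)}\\
x^{k+1} &= \arg\min_x\; f(x) + \tfrac{\mu}{2}\,\lVert x - z^{k} \rVert_2^2,\\
z^{k+1} &= \arg\min_z\; g(z) + \tfrac{\mu}{2}\,\lVert x^{k+1} - z \rVert_2^2.
\end{aligned}
```

Unrolling fixes a number of these iterations and replaces a subproblem (typically the prior step on z) with a learned module, which is the general pattern behind unfolded networks such as iGDANet.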
19. Ye F, Wu Z, Jia X, Chanussot J, Xu Y, Wei Z. Bayesian Nonlocal Patch Tensor Factorization for Hyperspectral Image Super-Resolution. IEEE Trans Image Process 2023; 32:5877-5892. [PMID: 37889806] [DOI: 10.1109/tip.2023.3326687]
Abstract
The synthesis of a high-resolution (HR) hyperspectral image (HSI) by fusing a low-resolution HSI with a corresponding HR multispectral image has emerged as a prevalent HSI super-resolution (HSR) scheme. Recent research has revealed that tensor analysis is an emerging tool for HSR. However, most off-the-shelf tensor-based HSR algorithms tend to encounter challenges in rank determination and modeling capacity. To address these issues, we construct nonlocal patch tensors (NPTs) and characterize their low-rank structures with coupled Bayesian tensor factorization. It is worth emphasizing that the intrinsic global spectral correlation and nonlocal spatial similarity can be simultaneously explored under the proposed model. Moreover, benefiting from the technique of automatic relevance determination, we propose a hierarchical probabilistic framework based on Canonical Polyadic (CP) factorization, which incorporates a sparsity-inducing prior over the underlying factor matrices. We further develop an effective expectation-maximization-type optimization scheme for framework estimation. In contrast to existing works, the proposed model can infer the latent CP rank of the NPT adaptively without tuning parameters. Extensive experiments on synthesized and real datasets illustrate the intrinsic capability of our model in rank determination as well as its superiority in fusion performance.
20. Zhao M, Dobigeon N, Chen J. Guided Deep Generative Model-Based Spatial Regularization for Multiband Imaging Inverse Problems. IEEE Trans Image Process 2023; 32:5692-5704. [PMID: 37812540] [DOI: 10.1109/tip.2023.3321460]
Abstract
When adopting a model-based formulation, solving the inverse problems encountered in multiband imaging requires defining spatial and spectral regularizations. In most works in the literature, spectral information is extracted directly from the observations to derive data-driven spectral priors. Conversely, the choice of the spatial regularization often boils down to the use of conventional penalizations (e.g., total variation) promoting expected features of the reconstructed image (e.g., piece-wise constant). In this work, we propose a generic framework able to capitalize on an auxiliary acquisition of high spatial resolution to derive tailored data-driven spatial regularizations. This approach leverages the ability of deep learning to extract high-level features. More precisely, the regularization is conceived as a deep generative network able to encode spatial semantic features contained in this auxiliary image of high spatial resolution. To illustrate the versatility of this approach, it is instantiated on two particular tasks, namely multiband image fusion and multiband image inpainting. Experimental results obtained on these two tasks demonstrate the benefit of this class of informed regularizations when compared to more conventional ones.
21. Dian R, Guo A, Li S. Zero-Shot Hyperspectral Sharpening. IEEE Trans Pattern Anal Mach Intell 2023; 45:12650-12666. [PMID: 37235456] [DOI: 10.1109/tpami.2023.3279050]
Abstract
Fusing hyperspectral images (HSIs) with multispectral images (MSIs) of higher spatial resolution has become an effective way to sharpen HSIs. Recently, deep convolutional neural networks (CNNs) have achieved promising fusion performance. However, these methods often suffer from a lack of training data and limited generalization ability. To address these problems, we present a zero-shot learning (ZSL) method for HSI sharpening. Specifically, we first propose a novel method to quantitatively estimate the spectral and spatial responses of imaging sensors with high accuracy. In the training procedure, we spatially subsample the MSI and HSI based on the estimated spatial response and use the downsampled HSI and MSI to infer the original HSI. In this way, we can not only exploit the inherent information in the HSI and MSI, but the trained CNN also generalizes well to the test data. In addition, we perform dimension reduction on the HSI, which reduces the model size and storage usage without sacrificing fusion accuracy. Furthermore, we design an imaging model-based loss function for the CNN, which further boosts the fusion performance. The experimental results show the significantly high efficiency and accuracy of our approach.
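The training-pair construction can be illustrated with a Wald-protocol-style spatial subsampling, here assuming a Gaussian spatial response (the paper estimates the actual sensor responses rather than assuming one):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def make_training_pair(hsi, msi, ratio=4, sigma=1.0):
    """Blur and subsample both observed images with an assumed Gaussian
    spatial response, so the observed LR-HSI can serve as the training
    target for a network fed the further-downsampled pair."""
    blur = lambda x: gaussian_filter(x, sigma=(sigma, sigma, 0))
    hsi_lr = blur(hsi)[::ratio, ::ratio, :]    # input 1: downsampled HSI
    msi_lr = blur(msi)[::ratio, ::ratio, :]    # input 2: downsampled MSI
    return (hsi_lr, msi_lr), hsi               # target: the observed HSI

hsi = np.random.rand(32, 32, 100)    # observed LR-HSI (toy sizes)
msi = np.random.rand(128, 128, 4)    # observed HR-MSI
(x1, x2), target = make_training_pair(hsi, msi)
```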
22. Li W, Gao Y, Zhang M, Tao R, Du Q. Asymmetric Feature Fusion Network for Hyperspectral and SAR Image Classification. IEEE Trans Neural Netw Learn Syst 2023; 34:8057-8070. [PMID: 35180093] [DOI: 10.1109/tnnls.2022.3149394]
Abstract
Joint classification using multisource remote sensing data for Earth observation is promising but challenging. Due to the gap in imaging mechanisms and the imbalanced information between multisource data, integrating their complementary merits for interpretation remains difficult. In this article, a classification method based on asymmetric feature fusion, named the asymmetric feature fusion network (AsyFFNet), is proposed. First, weight-shared residual blocks are utilized for feature extraction while keeping separate batch normalization (BN) layers. In the training phase, the redundancy of each channel is self-determined by its scaling factor in BN, and a channel is replaced by another channel when its scaling factor is less than a threshold. To eliminate unnecessary channels and improve generalization, a sparse constraint is imposed on a subset of the scaling factors. Besides, a feature calibration module is designed to exploit the spatial dependence of multisource features, so that the discrimination capability is enhanced. Experimental results on three datasets demonstrate that the proposed AsyFFNet significantly outperforms other competitive approaches.
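The scaling-factor mechanism resembles network-slimming-style channel selection: add an L1 penalty on the BN scale factors during training, then flag channels whose factor falls below a threshold. A PyTorch sketch of those two ingredients (the cross-branch channel replacement itself is omitted):

```python
import torch
import torch.nn as nn

def bn_sparsity_penalty(model, weight=1e-4):
    """L1 penalty on BatchNorm scale factors (gamma), added to the loss
    so that redundant channels are driven toward zero scale."""
    return weight * sum(m.weight.abs().sum()
                        for m in model.modules()
                        if isinstance(m, nn.BatchNorm2d))

def redundant_channels(bn, threshold=1e-2):
    """Indices of channels whose |gamma| is below the threshold; these are
    the candidates to be replaced by the sibling branch's channels."""
    return (bn.weight.detach().abs() < threshold).nonzero().flatten()

bn = nn.BatchNorm2d(64)
extra_loss = bn_sparsity_penalty(nn.Sequential(bn))
print(redundant_channels(bn))   # empty at init, since gamma starts at 1
```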
23. Wang K, Liao X, Li J, Meng D, Wang Y. Hyperspectral Image Super-Resolution via Knowledge-Driven Deep Unrolling and Transformer Embedded Convolutional Recurrent Neural Network. IEEE Trans Image Process 2023; 32:4581-4594. [PMID: 37467098] [DOI: 10.1109/tip.2023.3293768]
Abstract
Hyperspectral (HS) imaging has been widely used in various real application problems. However, due to hardware limitations, the obtained HS images usually have low spatial resolution, which can obviously degrade their performance. By fusing a low-spatial-resolution HS image with a high-spatial-resolution auxiliary image (e.g., a multispectral, RGB, or panchromatic image), so-called HS image fusion has underpinned much of the recent progress in enhancing the spatial resolution of HS images. Nonetheless, a corresponding well-registered auxiliary image is not always available in real situations. To remedy this issue, we propose in this paper a new single HS image super-resolution method based on a novel knowledge-driven deep unrolling technique. Precisely, we first propose a maximum a posteriori-based energy model with implicit priors, which can be solved by alternating optimization to determine an elementary iteration mechanism. We then unroll this iteration mechanism with an ingenious Transformer-embedded convolutional recurrent neural network in which two structural designs are integrated: the vision Transformer and 3D convolution learn the implicit spatial-spectral priors, and the recurrent hidden connections over iterations model the recurrence of the iterative reconstruction stages. Thus, an effective knowledge-driven, end-to-end, and data-dependent HS image super-resolution framework can be successfully attained. Extensive experiments on three HS image datasets demonstrate the superiority of the proposed method over several state-of-the-art HS image super-resolution methods.
24. Cao X, Lian Y, Liu Z, Wu J, Zhang W, Liu J. Hyperspectral image super-resolution via spectral matching and correction. J Opt Soc Am A Opt Image Sci Vis 2023; 40:1635-1643. [PMID: 37707121] [DOI: 10.1364/josaa.491595]
Abstract
Fusing a low-spatial-resolution hyperspectral image (LR-HSI) and a high-spatial-resolution RGB image (HR-RGB) is an important technique for obtaining a high-resolution hyperspectral image (HR-HSI). In this paper, we propose a dual-illuminance fusion-based super-resolution method consisting of spectral matching and correction. In the spectral matching stage, an LR-HSI patch is first searched for each HR-RGB pixel; with the minimum color difference as a constraint, the matching spectrum is constructed by linearly mixing the spectra in the HSI patch. In the spectral correction stage, we establish a polynomial model to correct the matched spectrum with the aid of the HR-RGBs illuminated by two illuminances, and the target spectrum is obtained. All pixels in the HR-RGB are traversed by the spectral matching and correction process, and the target HR-HSI is eventually reconstructed. The effectiveness of our method is evaluated on three public datasets and our real-world dataset. Experimental results demonstrate the effectiveness of our method compared with eight fusion methods.
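The matching step can be read as solving for mixing weights whose projected color best matches the HR-RGB pixel. A plain least-squares sketch with a hypothetical camera response matrix (the paper's minimum-color-difference constraint may be formulated differently):

```python
import numpy as np

def match_spectrum(S, C, rgb):
    """Find mixing weights w over candidate spectra (columns of S) so that
    the color of the mixed spectrum, C @ (S @ w), approximates the target
    HR-RGB pixel, then return the mixed spectrum itself."""
    A = C @ S                                    # (3, k): color of each candidate
    w, *_ = np.linalg.lstsq(A, rgb, rcond=None)  # min-norm least squares
    return S @ w                                 # matched spectrum (bands,)

S = np.random.rand(31, 9)        # 9 candidate spectra from an LR-HSI patch
C = np.random.rand(3, 31)        # hypothetical camera response (bands -> RGB)
spec = match_spectrum(S, C, rgb=np.array([0.4, 0.5, 0.2]))
```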
25. Ran R, Deng LJ, Jiang TX, Hu JF, Chanussot J, Vivone G. GuidedNet: A General CNN Fusion Framework via High-Resolution Guidance for Hyperspectral Image Super-Resolution. IEEE Trans Cybern 2023; 53:4148-4161. [PMID: 37022388] [DOI: 10.1109/tcyb.2023.3238200]
Abstract
Hyperspectral image super-resolution (HISR) is the task of fusing a low-resolution hyperspectral image (LR-HSI) and a high-resolution multispectral image (HR-MSI) to generate a high-resolution hyperspectral image (HR-HSI). Recently, convolutional neural network (CNN)-based techniques have been extensively investigated for HISR, yielding competitive outcomes. However, existing CNN-based methods often require a huge number of network parameters, leading to a heavy computational burden and thus limiting generalization ability. In this article, we fully consider the characteristics of HISR, proposing a general CNN fusion framework with high-resolution guidance, called GuidedNet. This framework consists of two branches: 1) the high-resolution guidance branch (HGB), which decomposes the high-resolution guidance image into several scales, and 2) the feature reconstruction branch (FRB), which takes the low-resolution image and the multiscale high-resolution guidance images from the HGB to reconstruct the high-resolution fused image. GuidedNet can effectively predict the high-resolution residual details that are added to the upsampled HSI to simultaneously improve spatial quality and preserve spectral information. The proposed framework is implemented using recursive and progressive strategies, which promote high performance with a significant reduction in network parameters, while ensuring network stability by supervising several intermediate outputs. Additionally, the proposed approach is also suitable for other resolution enhancement tasks, such as remote sensing pansharpening and single-image super-resolution (SISR). Extensive experiments on simulated and real datasets demonstrate that the proposed framework generates state-of-the-art outcomes for several applications (i.e., HISR, pansharpening, and SISR). Finally, an ablation study and further discussions assessing, for example, the network generalization, the low computational cost, and the reduced number of network parameters, are provided to the readers. The code link is: https://github.com/Evangelion09/GuidedNet.
26. Pu H, Yu J, Sun DW, Wei Q, Li Q. Distinguishing pericarpium citri reticulatae of different origins using terahertz time-domain spectroscopy combined with convolutional neural networks. Spectrochim Acta A Mol Biomol Spectrosc 2023; 299:122771. [PMID: 37244024] [DOI: 10.1016/j.saa.2023.122771]
Abstract
The geographical indication of pericarpium citri reticulatae (PCR) is very important in grading the quality and price of PCRs. Therefore, in this study, terahertz time-domain spectroscopy (THz-TDS) combined with convolutional neural networks (CNNs) was proposed to nondestructively distinguish PCRs of different origins. A one-dimensional CNN (1D-CNN) model based on spectral data preprocessed with standard normal variate (SNV) transformation achieved an accuracy of 82.99%. Two-dimensional image features were generated from the unprocessed spectral data using the Gramian angular field (GAF), the Markov transition field (MTF), and the recurrence plot (RP), and were used to build a two-dimensional CNN (2D-CNN) model with an accuracy of 78.33%. Further, CNN models with different fusion methods were developed to fuse the spectral and image data. Among them, a CNN model that adds spectra and images (Add-CNN) performed better, with an accuracy of 86.17%. Eventually, the Add-CNN model based on ten frequencies extracted using permutation importance (PI) achieved the identification of PCRs from different origins. Overall, the current study provides a new method for identifying PCRs of different origins, which is expected to be used for the traceability of PCR products.
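Of the three spectrum-to-image encodings, the Gramian angular (summation) field has a particularly compact form: rescale the series to [-1, 1], take the arccosine, and build the pairwise cosine-sum matrix. A NumPy sketch:

```python
import numpy as np

def gramian_angular_field(x):
    """Gramian angular summation field: encode a 1-D series as a 2-D image
    suitable as 2D-CNN input, as done for the THz-TDS spectra here."""
    x = np.asarray(x, dtype=float)
    x = 2 * (x - x.min()) / (x.max() - x.min() + 1e-12) - 1  # rescale to [-1, 1]
    phi = np.arccos(np.clip(x, -1.0, 1.0))                   # polar angles
    return np.cos(phi[:, None] + phi[None, :])               # cos(phi_i + phi_j)

spectrum = np.random.rand(128)        # stand-in for a THz-TDS spectrum
image = gramian_angular_field(spectrum)
print(image.shape)                    # (128, 128) image for the 2D-CNN
```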
Affiliation(s)
- Hongbin Pu: School of Food Science and Engineering, South China University of Technology, Guangzhou 510641, China; Academy of Contemporary Food Engineering, South China University of Technology, Guangzhou Higher Education Mega Center, Guangzhou 510006, China; Engineering and Technological Research Centre of Guangdong Province on Intelligent Sensing and Process Control of Cold Chain Foods, & Guangdong Province Engineering Laboratory for Intelligent Cold Chain Logistics Equipment for Agricultural Products, Guangzhou Higher Education Mega Centre, Guangzhou 510006, China
- Jingxiao Yu: School of Food Science and Engineering, South China University of Technology, Guangzhou 510641, China; Academy of Contemporary Food Engineering, South China University of Technology, Guangzhou Higher Education Mega Center, Guangzhou 510006, China; Engineering and Technological Research Centre of Guangdong Province on Intelligent Sensing and Process Control of Cold Chain Foods, & Guangdong Province Engineering Laboratory for Intelligent Cold Chain Logistics Equipment for Agricultural Products, Guangzhou Higher Education Mega Centre, Guangzhou 510006, China
- Da-Wen Sun: School of Food Science and Engineering, South China University of Technology, Guangzhou 510641, China; Academy of Contemporary Food Engineering, South China University of Technology, Guangzhou Higher Education Mega Center, Guangzhou 510006, China; Engineering and Technological Research Centre of Guangdong Province on Intelligent Sensing and Process Control of Cold Chain Foods, & Guangdong Province Engineering Laboratory for Intelligent Cold Chain Logistics Equipment for Agricultural Products, Guangzhou Higher Education Mega Centre, Guangzhou 510006, China; Food Refrigeration and Computerized Food Technology (FRCFT), Agriculture and Food Science Centre, University College Dublin, National University of Ireland, Belfield, Dublin 4, Ireland
- Qingyi Wei: School of Food Science and Engineering, South China University of Technology, Guangzhou 510641, China; Academy of Contemporary Food Engineering, South China University of Technology, Guangzhou Higher Education Mega Center, Guangzhou 510006, China; Engineering and Technological Research Centre of Guangdong Province on Intelligent Sensing and Process Control of Cold Chain Foods, & Guangdong Province Engineering Laboratory for Intelligent Cold Chain Logistics Equipment for Agricultural Products, Guangzhou Higher Education Mega Centre, Guangzhou 510006, China
- Qian Li: Shenzhen Institute of Terahertz Technology and Innovation, Shenzhen, Guangdong 518102, China
27. Jiang TX, Zhao XL, Zhang H, Ng MK. Dictionary Learning With Low-Rank Coding Coefficients for Tensor Completion. IEEE Trans Neural Netw Learn Syst 2023; 34:932-946. [PMID: 34464263] [DOI: 10.1109/tnnls.2021.3104837]
Abstract
In this article, we propose a novel tensor learning and coding model for third-order data completion. The aim of our model is to learn a data-adaptive dictionary from the given observations and determine the coding coefficients of third-order tensor tubes. In the completion process, we minimize the low-rankness of each tensor slice containing the coding coefficients. Compared with a traditional predefined transform basis, the advantages of the proposed model are as follows: 1) the dictionary can be learned based on the given data observations so that the basis can be constructed more adaptively and accurately and 2) the low-rankness of the coding coefficients allows the linear combination of dictionary features to be more effective. We also develop a multiblock proximal alternating minimization algorithm for solving this tensor learning and coding model and show that the sequence generated by the algorithm globally converges to a critical point. Extensive experimental results on real datasets such as videos, hyperspectral images, and traffic data are reported to demonstrate these advantages and show that the performance of the proposed tensor learning and coding method is significantly better than that of other tensor completion methods in terms of several evaluation metrics.
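Minimizing the low-rankness of a coefficient slice inside an alternating scheme typically reduces to singular value thresholding, the proximal operator of the nuclear norm. A generic sketch of that single step (not the paper's exact multiblock iteration):

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: prox of tau * ||X||_*. Shrinks the
    spectrum of X, encouraging a low-rank slice of coding coefficients."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

Z = np.random.randn(40, 40)          # a coefficient slice
Z_low = svt(Z, tau=3.0)
print(np.linalg.matrix_rank(Z_low))  # typically much smaller than 40
```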
28. Li J, Wu C, Song R, Li Y, Xie W, He L, Gao X. Deep Hybrid 2-D-3-D CNN Based on Dual Second-Order Attention With Camera Spectral Sensitivity Prior for Spectral Super-Resolution. IEEE Trans Neural Netw Learn Syst 2023; 34:623-634. [PMID: 34347604] [DOI: 10.1109/tnnls.2021.3098767]
Abstract
A largely ignored fact in spectral super-resolution (SSR) is that existing mapping methods neglect the auxiliary prior of camera spectral sensitivity (CSS) and focus only on designing wider or deeper network architectures, while failing to exploit the spatial and spectral dependencies among intermediate layers, thereby constraining the representational capability of convolutional neural networks (CNNs). To overcome these drawbacks, we propose a novel deep hybrid 2-D-3-D CNN based on dual second-order attention with a CSS prior (HSACS), which can exploit sufficient spatial-spectral context information. Specifically, the dual second-order attention embedded in the residual block for more powerful spatial-spectral feature representation and relation learning consists of a trainable 2-D second-order channel attention (SCA) or 3-D second-order band attention (SBA), together with a structure tensor attention (STA). The band and channel attention modules adaptively recalibrate band-wise and interchannel features by employing second-order band or channel feature statistics for more discriminative representations. The STA, in turn, rebuilds significant high-frequency spatial details for sufficient spatial feature extraction. Moreover, the CSS is employed as a prior for the first time to avoid its adverse effect on SSR quality; with it, the resolved RGB can be computed naturally from the super-reconstructed hyperspectral image (HSI), and the final loss combines the discrepancies of both the RGB and the HSI as a finer constraint. Experimental results demonstrate the superiority of the presented approach over SOTA SSR methods in terms of quantitative metrics and visual effect.
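To make the second-order attention idea concrete, the following PyTorch sketch recalibrates channels from covariance (second-order) statistics rather than plain average pooling; the layer shapes and reduction ratio are our illustrative assumptions, not the exact SCA design:

    import torch
    import torch.nn as nn

    class SecondOrderChannelAttention(nn.Module):
        def __init__(self, channels, reduction=8):
            super().__init__()
            self.fc = nn.Sequential(
                nn.Conv2d(channels, channels // reduction, 1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // reduction, channels, 1),
                nn.Sigmoid(),
            )

        def forward(self, x):
            b, c, h, w = x.shape
            feat = x.reshape(b, c, h * w)
            feat = feat - feat.mean(dim=2, keepdim=True)
            # Channel covariance (biased estimate): second-order statistics.
            cov = torch.bmm(feat, feat.transpose(1, 2)) / (h * w)
            # Row-wise mean summarizes each channel's second-order profile.
            stats = cov.mean(dim=2).reshape(b, c, 1, 1)
            return x * self.fc(stats)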
Collapse
|
29
|
HMFT: Hyperspectral and Multispectral Image Fusion Super-Resolution Method Based on Efficient Transformer and Spatial-Spectral Attention Mechanism. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2023; 2023:4725986. [PMID: 36909978 PMCID: PMC9995205 DOI: 10.1155/2023/4725986] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/25/2022] [Revised: 02/12/2023] [Accepted: 02/14/2023] [Indexed: 03/05/2023]
Abstract
Due to the imaging mechanism of hyperspectral images, the spatial resolution of the resulting images is low. An effective way to address this problem is to fuse the low-resolution hyperspectral image (LR-HSI) with the high-resolution multispectral image (HR-MSI) to generate the high-resolution hyperspectral image (HR-HSI). Currently, state-of-the-art fusion approaches are based on convolutional neural networks (CNN), and few have attempted to use the Transformer, which shows impressive performance on advanced vision tasks. In this paper, a simple and efficient Transformer-based hybrid architecture network is proposed to solve the hyperspectral image fusion super-resolution problem. We combine convolution and the Transformer in the backbone network to fully extract spatial-spectral information, taking advantage of the local and global modeling capabilities of both. To emphasize features conducive to HR-HSI reconstruction, such as high-frequency information, and to explore the correlation between spectra, a convolutional attention mechanism further refines the extracted features along the spatial and spectral dimensions, respectively. In addition, considering that the resolution of an HSI is usually large, we use a feature split module (FSM) to replace the self-attention computation of the native Transformer, which reduces the computational complexity and storage requirements of the model and greatly improves training efficiency. Extensive experiments show that the proposed network architecture achieves the best qualitative and quantitative performance compared with the latest HSI super-resolution methods.
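The abstract does not spell out the FSM's internals, so the PyTorch sketch below is a hypothetical stand-in: it splits the token sequence into contiguous chunks and runs self-attention within each chunk, cutting the quadratic cost over n tokens to roughly n^2/g for g splits:

    import torch
    import torch.nn as nn

    class SplitAttention(nn.Module):
        # Hypothetical stand-in for the feature split module (FSM): attend
        # within contiguous token chunks instead of over the full sequence.
        def __init__(self, dim, splits=4):
            super().__init__()
            self.splits = splits
            self.qkv = nn.Linear(dim, 3 * dim)
            self.proj = nn.Linear(dim, dim)

        def forward(self, x):                      # x: (batch, tokens, dim)
            b, n, d = x.shape
            g = self.splits
            assert n % g == 0, "token count must be divisible by split count"
            q, k, v = (t.reshape(b * g, n // g, d)
                       for t in self.qkv(x).chunk(3, dim=-1))
            attn = torch.softmax(q @ k.transpose(-2, -1) / d ** 0.5, dim=-1)
            return self.proj((attn @ v).reshape(b, n, d))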
Collapse
|
30
|
Zhang M, Wu Q, Zhang J, Gao X, Guo J, Tao D. Fluid Micelle Network for Image Super-Resolution Reconstruction. IEEE TRANSACTIONS ON CYBERNETICS 2023; 53:578-591. [PMID: 35442898 DOI: 10.1109/tcyb.2022.3163294] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Most existing convolutional-neural-network-based super-resolution (SR) methods focus on designing effective neural blocks but rarely describe the image SR mechanism from the perspective of image evolution during the SR process. In this study, we explore a new research routine by abstracting the movement of pixels in the reconstruction process as the flow of fluid in the field of fluid dynamics (FD), where explicit motion laws of particles have been discovered. Specifically, a novel fluid micelle network is devised for image SR based on the theory of FD; it follows the residual learning scheme but learns the residual structure by solving the finite difference equation of FD. The pixel motion equation in the SR process is derived from the Navier-Stokes (N-S) equation, establishing a guided branch that is aware of edge information. Thus, the second-order residual drives the network for feature extraction, and the guided branch corrects the direction of the pixel stream to supplement the details. Experiments on popular benchmarks and a real-world microscope chip image dataset demonstrate that the proposed method outperforms other modern methods in terms of both objective metrics and visual quality. The proposed method can also reconstruct clear geometric structures, offering potential for real-world applications.
Collapse
|
31
|
Xu H, Jiang J, Feng Y, Jin Y, Zheng J. Tensor completion via hybrid shallow-and-deep priors. APPL INTELL 2022. [DOI: 10.1007/s10489-022-04331-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022]
|
32
|
Yang J, Wu C, You T, Wang D, Li Y, Shang C, Shen Q. Hierarchical spatio-spectral fusion for hyperspectral image super resolution via sparse representation and pre-trained deep model. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.110170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/03/2022]
|
33
|
Hyperspectral Image Super-Resolution Method Based on Spectral Smoothing Prior and Tensor Tubal Row-Sparse Representation. REMOTE SENSING 2022. [DOI: 10.3390/rs14092142] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/01/2023]
Abstract
Due to limited hardware conditions, a hyperspectral image (HSI) has low spatial resolution, while a multispectral image (MSI) attains higher spatial resolution. Therefore, building on the idea of fusion, we reconstruct an HSI with high spatial and spectral resolution from an HSI and an MSI, and put forward an HSI super-resolution model based on a spectral smoothing prior and tensor tubal row-sparse representation, termed SSTSR. First, nonlocal priors are applied to refine the super-resolution task into reconstructing each nonlocal clustering tensor. Each nonlocal cluster tensor is then decomposed into two sub-tensors under the tensor t-product framework: one sub-tensor is called the tensor dictionary and the other the tensor coefficient. Meanwhile, in the process of dictionary learning and sparse coding, a spectral smoothing constraint is imposed on the tensor dictionary, and an L1,1,2-norm-based tubal row-sparse regularizer is enforced on the tensor coefficient to enhance structured sparsity. With this model, the spatial and spectral similarities of the nonlocal cluster tensor are fully utilized. Finally, the alternating direction method of multipliers (ADMM) is employed to solve the model. Experiments on three simulated datasets and one real dataset show that our approach is superior to many advanced HSI super-resolution methods.
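The tubal row-sparse regularizer admits a closed-form proximal step: group soft-thresholding over each horizontal slice of the coefficient tensor. A minimal NumPy sketch, assuming the L1,1,2 norm groups all tubes belonging to one row (the function name is ours):

    import numpy as np

    def prox_row_sparse(C, lam):
        # Group soft-thresholding: shrink each horizontal slice C[i] as a
        # whole, zeroing out rows whose energy falls below the threshold,
        # which is what enforces tubal row-sparsity.
        out = np.zeros_like(C)
        for i in range(C.shape[0]):
            norm = np.linalg.norm(C[i])
            if norm > lam:
                out[i] = (1.0 - lam / norm) * C[i]
        return out

Inside ADMM, this shrinkage would alternate with the dictionary and auxiliary-variable updates.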
Collapse
|
34
|
Abstract
Hyperspectral image anomaly detection (HSI-AD) has become one of the research hotspots in the field of remote sensing. Because the combination of image and spectral information in HSIs provides a rich data basis for detecting abnormal objects, HSI-AD has great application potential in HSI analysis. It is difficult to effectively extract the large number of nonlinear features contained in HSI data using traditional machine learning methods, whereas deep learning has clear advantages in nonlinear feature extraction. Consequently, deep learning has been widely used in HSI-AD and has shown excellent performance. This review systematically summarizes the literature on deep-learning-based HSI-AD and classifies the corresponding methods for performance comparison. Specifically, we first introduce the characteristics of HSI-AD and the challenges faced by traditional methods, along with the advantages of deep learning in addressing these problems. We then systematically review and classify the corresponding methods. Finally, the performance of deep-learning-based HSI-AD methods is compared on several mainstream datasets, and the remaining challenges are summarized. The main purpose of this article is to provide a comprehensive overview of HSI-AD methods as a reference for future research.
Collapse
|
35
|
Lai Z, Wei K, Fu Y. Deep plug-and-play prior for hyperspectral image restoration. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.01.057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
|
36
|
He W, Yao Q, Li C, Yokoya N, Zhao Q, Zhang H, Zhang L. Non-Local Meets Global: An Iterative Paradigm for Hyperspectral Image Restoration. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022; 44:2089-2107. [PMID: 32991278 DOI: 10.1109/tpami.2020.3027563] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Non-local low-rank tensor approximation has been developed as a state-of-the-art method for hyperspectral image (HSI) restoration, which includes the tasks of denoising, compressed HSI reconstruction and inpainting. Unfortunately, while its restoration performance benefits from more spectral bands, its runtime also substantially increases. In this paper, we claim that the HSI lies in a global spectral low-rank subspace, and the spectral subspaces of each full band patch group should lie in this global low-rank subspace. This motivates us to propose a unified paradigm combining the spatial and spectral properties for HSI restoration. The proposed paradigm enjoys performance superiority from the non-local spatial denoising and light computation complexity from the low-rank orthogonal basis exploration. An efficient alternating minimization algorithm with rank adaptation is developed. It is done by first solving a fidelity term-related problem for the update of a latent input image, and then learning a low-dimensional orthogonal basis and the related reduced image from the latent input image. Subsequently, non-local low-rank denoising is developed to refine the reduced image and orthogonal basis iteratively. Finally, the experiments on HSI denoising, compressed reconstruction, and inpainting tasks, with both simulated and real datasets, demonstrate its superiority with respect to state-of-the-art HSI restoration methods.
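The global spectral low-rank subspace can be estimated by a truncated SVD of the spectrally unfolded HSI; the reduced "eigen-images" are then what the non-local denoising refines. A minimal NumPy sketch (names and the fixed rank k are our assumptions; the paper adapts the rank):

    import numpy as np

    def spectral_subspace(hsi, k):
        # hsi: (H, W, B). Unfold to (B, H*W) and keep the top-k left
        # singular vectors as an orthogonal spectral basis E of shape (B, k).
        H, W, B = hsi.shape
        Y = hsi.reshape(-1, B).T
        U, _, _ = np.linalg.svd(Y, full_matrices=False)
        E = U[:, :k]
        reduced = (E.T @ Y).T.reshape(H, W, k)   # reduced image to denoise
        return E, reduced

    # After refining `reduced`, the estimate is projected back:
    # restored = (E @ reduced.reshape(-1, k).T).T.reshape(H, W, B)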
Collapse
|
37
|
Wang C, Zheng L. AI-Based Publicity Strategies for Medical Colleges: A Case Study of Healthcare Analysis. Front Public Health 2022; 9:832568. [PMID: 35198536 PMCID: PMC8858836 DOI: 10.3389/fpubh.2021.832568] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2021] [Accepted: 12/20/2021] [Indexed: 11/27/2022] Open
Abstract
The health status and cognition of undergraduates, especially their grasp of the scientific concept of healthcare, are particularly important for their own development and that of society. Surveys show a significant lack of healthcare knowledge among undergraduates in medical colleges, even among medical undergraduates, let alone non-medical undergraduates. Offering healthcare lectures or electives to undergraduates in medical colleges is therefore a good way to strengthen their understanding of healthcare and reinforce healthcare concepts. In addition, undergraduates' emotional and mental state during healthcare lectures or electives can be analyzed to determine whether they have hidden illnesses and how well they understand the healthcare content. In this study, a mental state recognition method for undergraduates in medical colleges based on data mining technology is first proposed. Then, vision-based expression and posture cues are used to expand the channels of emotion recognition, and a dual-channel emotion recognition model based on artificial intelligence (AI) for healthcare lectures or electives in a medical college is proposed. Finally, simulations of mental state recognition and emotion recognition are run in TensorFlow. The results show that the accuracy of mental state recognition for undergraduates in a medical college exceeds 92% with very low rejection and misrecognition rates, that its false match and false non-match rates are significantly better than the other three benchmarks, and that the accuracy of the dual-channel emotion recognition method exceeds 96%, effectively integrating the emotional information expressed by facial expressions and postures.
Collapse
|
38
|
Abstract
Pansharpening is an important yet challenging remote sensing image processing task, which aims to reconstruct a high-resolution (HR) multispectral (MS) image by fusing an HR panchromatic (PAN) image and a low-resolution (LR) MS image. Although deep learning (DL)-based pansharpening methods have achieved encouraging performance, they fail to fully utilize the deep semantic features and the shallow contextual features when fusing an HR-PAN image and an LR-MS image. In this paper, we propose an efficient full-depth feature fusion network (FDFNet) for remote sensing pansharpening. Specifically, we design three distinctive branches, called the PAN branch, the MS branch, and the fusion branch. The features extracted from the PAN and MS branches are progressively injected into the fusion branch at every depth, making the information fusion broader and more comprehensive. With this structure, low-level contextual features and high-level semantic features can be characterized and integrated adequately. Extensive experiments on reduced- and full-resolution datasets acquired by the WorldView-3, QuickBird, and GaoFen-2 sensors demonstrate that the proposed FDFNet, with fewer than 100,000 parameters, performs better than other detail-injection-based proposals and several state-of-the-art approaches, both visually and quantitatively.
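A minimal PyTorch sketch of the full-depth injection pattern described above (widths, depth, and block design are our illustrative simplifications, not the actual FDFNet blocks):

    import torch
    import torch.nn as nn

    class FullDepthFusion(nn.Module):
        # PAN and MS branch features are concatenated into the fusion
        # branch at every depth, so the fusion path sees both shallow
        # contextual and deep semantic features.
        def __init__(self, ms_bands, feats=32, depth=3):
            super().__init__()
            conv = lambda cin, cout: nn.Sequential(
                nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True))
            self.pan = nn.ModuleList(
                [conv(1 if i == 0 else feats, feats) for i in range(depth)])
            self.ms = nn.ModuleList(
                [conv(ms_bands if i == 0 else feats, feats) for i in range(depth)])
            self.fuse = nn.ModuleList(
                [conv((2 if i == 0 else 3) * feats, feats) for i in range(depth)])
            self.out = nn.Conv2d(feats, ms_bands, 3, padding=1)

        def forward(self, pan, ms_up):   # ms_up: MS upsampled to PAN size
            p, m, f = pan, ms_up, None
            for i, (bp, bm, bf) in enumerate(zip(self.pan, self.ms, self.fuse)):
                p, m = bp(p), bm(m)
                f = bf(torch.cat([p, m] if i == 0 else [f, p, m], dim=1))
            return ms_up + self.out(f)   # residual detail injection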
Collapse
|
39
|
Zhao XL, Yang JH, Ma TH, Jiang TX, Ng MK, Huang TZ. Tensor Completion via Complementary Global, Local, and Nonlocal Priors. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2022; 31:984-999. [PMID: 34971534 DOI: 10.1109/tip.2021.3138325] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Completing missing entries in multidimensional visual data is a typical ill-posed problem that requires appropriate exploitation of prior information of the underlying data. Commonly used priors can be roughly categorized into three classes: global tensor low-rankness, local properties, and nonlocal self-similarity (NSS); most existing works utilize one or two of them to implement completion. Naturally, there arises an interesting question: can one concurrently make use of multiple priors in a unified way, such that they can collaborate with each other to achieve better performance? This work gives a positive answer by formulating a novel tensor completion framework which can simultaneously take advantage of the global-local-nonlocal priors. In the proposed framework, the tensor train (TT) rank is adopted to characterize the global correlation; meanwhile, two Plug-and-Play (PnP) denoisers, including a convolutional neural network (CNN) denoiser and the color block-matching and 3D filtering (CBM3D) denoiser, are incorporated to preserve local details and exploit NSS, respectively. Then, we design a proximal alternating minimization algorithm to efficiently solve this model under the PnP framework. Under mild conditions, we establish the convergence guarantee of the proposed algorithm. Extensive experiments show that these priors organically benefit from each other to achieve state-of-the-art performance both quantitatively and qualitatively.
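The global prior rests on the tensor-train (TT) rank, defined through sequential matricizations. As a hedged illustration, here is a minimal NumPy TT-SVD with a uniform rank cap (our simplification for exposition; it is not the paper's completion solver):

    import numpy as np

    def tt_svd(t, max_rank):
        # Factor an N-way array into tensor-train cores via sequential
        # truncated SVDs; the truncation ranks are the TT ranks that a
        # TT-based completion model penalizes to capture global correlation.
        dims = t.shape
        cores, r = [], 1
        mat = t.reshape(dims[0], -1)
        for k in range(len(dims) - 1):
            U, s, Vt = np.linalg.svd(mat, full_matrices=False)
            rk = min(max_rank, len(s))
            cores.append(U[:, :rk].reshape(r, dims[k], rk))
            mat = (s[:rk, None] * Vt[:rk]).reshape(rk * dims[k + 1], -1)
            r = rk
        cores.append(mat.reshape(r, dims[-1], 1))
        return cores   # contracting the cores recovers the approximation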
Collapse
|
40
|
Graph-based few-shot learning with transformed feature propagation and optimal class allocation. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.10.110] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]
|
41
|
Hyperspectral Image Mixed Noise Removal Using Subspace Representation and Deep CNN Image Prior. REMOTE SENSING 2021. [DOI: 10.3390/rs13204098] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
The ever-increasing spectral resolution of hyperspectral images (HSIs) is often obtained at the cost of a decrease in the signal-to-noise ratio (SNR) of the measurements. The decreased SNR reduces the reliability of measured features or information extracted from HSIs, thus calling for effective denoising techniques. This work aims to estimate clean HSIs from observations corrupted by mixed noise (containing Gaussian noise, impulse noise, and dead-lines/stripes) by exploiting two main characteristics of hyperspectral data, namely low-rankness in the spectral domain and high correlation in the spatial domain. We take advantage of the spectral low-rankness of HSIs by representing spectral vectors in an orthogonal subspace, which is learned from the observed images by a new method. The subspace representation coefficients of the HSI are then learned by solving an optimization problem plugged with an image prior extracted from a neural denoising network. The proposed method is evaluated on simulated and real HSIs through an exhaustive array of experiments and comparisons with state-of-the-art denoisers.
Collapse
|
42
|
Difference Curvature Multidimensional Network for Hyperspectral Image Super-Resolution. REMOTE SENSING 2021. [DOI: 10.3390/rs13173455] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
In recent years, convolutional-neural-network-based methods have been introduced to the field of hyperspectral image super-resolution following their great success in the field of RGB image super-resolution. However, hyperspectral images differ from RGB images in their high dimensionality, which implies redundancy in the high-dimensional space. Existing approaches struggle to learn the spectral correlation and spatial priors, leading to inferior performance. In this paper, we present a difference curvature multidimensional network for hyperspectral image super-resolution that exploits the spectral correlation to help improve the spatial resolution. Specifically, we introduce a multidimensional enhanced convolution (MEC) unit into the network to learn the spectral correlation through a self-attention mechanism. Meanwhile, it reduces the redundancy in the spectral dimension via a bottleneck projection to condense useful spectral features and reduce computations. To remove unrelated information in the high-dimensional space and extract the delicate texture features of a hyperspectral image, we design an additional difference curvature branch (DCB), which works as an edge indicator to fully preserve the texture information and eliminate unwanted noise. Experiments on three publicly available datasets demonstrate that the proposed method recovers sharper images with minimal spectral distortion compared to state-of-the-art methods; in PSNR/SAM it is 0.3–0.5 dB/0.2–0.4 better than the second-best methods.
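The difference curvature edge indicator underlying the DCB has a standard closed form: the absolute difference between the second derivatives along and across the gradient direction. A minimal NumPy sketch for a single band (finite differences via np.gradient; the epsilon is an illustrative stabilizer):

    import numpy as np

    def difference_curvature(u, eps=1e-8):
        # D = | |u_nn| - |u_tt| |, where u_nn and u_tt are the second
        # derivatives along and perpendicular to the gradient direction.
        # D is large at edges but small in flat regions and at isolated noise.
        uy, ux = np.gradient(u)           # axis 0 = rows (y), axis 1 = cols (x)
        uxy, uxx = np.gradient(ux)
        uyy, _ = np.gradient(uy)
        g = ux**2 + uy**2 + eps
        u_nn = (ux**2 * uxx + 2 * ux * uy * uxy + uy**2 * uyy) / g
        u_tt = (uy**2 * uxx - 2 * ux * uy * uxy + ux**2 * uyy) / g
        return np.abs(np.abs(u_nn) - np.abs(u_tt))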
Collapse
|
43
|
Abstract
In this paper, we propose a cross-direction and progressive network, termed CPNet, to solve the pan-sharpening problem. Full processing of information is the main characteristic of our model, reflected in two aspects: on the one hand, we process the source images in a cross-direction manner to obtain source images of different scales as inputs to the fusion modules at different stages, which maximizes the usage of multi-scale information in the source images; on the other hand, a progressive reconstruction loss is designed to boost the training of our network and avoid partial inactivation, while keeping the fused result consistent with the ground truth. Since the extraction of information from the source images and the reconstruction of the fused image are based on the entire image rather than a single type of information, little spatial or spectral information is lost to insufficient processing. Extensive experiments, including qualitative and quantitative comparisons, demonstrate that our model maintains more spatial and spectral information than state-of-the-art pan-sharpening methods.
Collapse
|
44
|
Zha Z, Wen B, Yuan X, Zhou JT, Zhou J, Zhu C. Triply Complementary Priors for Image Restoration. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 30:5819-5834. [PMID: 34133279 DOI: 10.1109/tip.2021.3086049] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Recent works that utilize deep models have achieved superior results in various image restoration (IR) applications. This approach is typically supervised and requires a corpus of training images with distributions similar to the images to be recovered. On the other hand, shallow methods, which are usually unsupervised, still deliver promising performance in many inverse problems, e.g., image deblurring and image compressive sensing (CS), as they can effectively leverage the nonlocal self-similarity priors of natural images. However, most such methods are patch-based, leading to restored images with various artifacts due to naive patch aggregation, in addition to slow speed. Using either approach alone usually limits performance and generalizability in IR tasks. In this paper, we propose a joint low-rank and deep (LRD) image model, which contains a pair of triply complementary priors, namely internal and external, shallow and deep, and non-local and local priors. We then propose a novel hybrid plug-and-play (H-PnP) framework based on the LRD model for IR. Following this, a simple yet effective algorithm is developed to solve the proposed H-PnP-based IR problems. Extensive experimental results on several representative IR tasks, including image deblurring, image CS, and image deblocking, demonstrate that the proposed H-PnP algorithm achieves favorable performance compared to many popular or state-of-the-art IR methods in terms of both objective quality and visual perception.
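As a hedged illustration of the plug-and-play idea, the sketch below runs a half-quadratic-splitting-style loop with a generic denoiser standing in for the joint low-rank and deep prior; the step size, penalty, and iteration count are placeholder choices, not the paper's algorithm:

    import numpy as np

    def pnp_restore(y, forward, adjoint, denoise, rho=0.05, step=0.1, iters=30):
        # forward/adjoint: callables implementing the degradation A and A^T.
        # x-update: gradient step on 0.5||y - A x||^2 + 0.5*rho*||x - z||^2.
        # z-update: any plug-in denoiser acting as the image prior.
        x = adjoint(y)
        z = x.copy()
        for _ in range(iters):
            x = x - step * (adjoint(forward(x) - y) + rho * (x - z))
            z = denoise(x)
        return x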
Collapse
|
45
|
Dong W, Zhou C, Wu F, Wu J, Shi G, Li X. Model-Guided Deep Hyperspectral Image Super-Resolution. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 30:5754-5768. [PMID: 33979283 DOI: 10.1109/tip.2021.3078058] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
The trade-off between spatial and spectral resolution is one of the fundamental issues in hyperspectral images (HSI). Given the challenges of directly acquiring high-resolution hyperspectral images (HR-HSI), a compromise is to fuse a pair of images: one with high resolution (HR) in the spatial domain but low resolution (LR) in the spectral domain, and the other vice versa. Model-based image fusion methods, including pan-sharpening, aim at reconstructing the HR-HSI by solving manually designed objective functions. However, such hand-crafted priors often lead to inevitable performance degradation due to a lack of end-to-end optimization. Although several deep learning-based methods have been proposed for hyperspectral pan-sharpening, HR-HSI-related domain knowledge has not been fully exploited, leaving room for further improvement. In this paper, we propose an iterative Hyperspectral Image Super-Resolution (HSISR) algorithm based on a deep HSI denoiser to leverage both the domain knowledge likelihood and the deep image prior. By taking the observation matrix of the HSI into account during end-to-end optimization, we show how to unfold an iterative HSISR algorithm into a novel model-guided deep convolutional network (MoG-DCN). Representing the observation matrix by subnetworks also allows the unfolded deep HSISR network to work with different HSI situations, which enhances the flexibility of MoG-DCN. Extensive experimental results demonstrate that the proposed MoG-DCN outperforms several leading HSISR methods in terms of both implementation cost and visual quality. The code is available at https://see.xidian.edu.cn/faculty/wsdong/Projects/MoG-DCN.htm.
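One unfolded stage of such a model-guided network can be sketched as a gradient step on the data fidelity followed by a small learned denoiser; this PyTorch fragment is a simplified stand-in for a MoG-DCN stage, with the degradation A and its adjoint At passed in as callables:

    import torch
    import torch.nn as nn

    class UnfoldedStage(nn.Module):
        # Gradient step on ||y - A x||^2, then residual CNN denoising;
        # stacking several such stages yields a deep unfolding network.
        def __init__(self, bands, feats=32):
            super().__init__()
            self.step = nn.Parameter(torch.tensor(0.1))  # learned step size
            self.denoiser = nn.Sequential(
                nn.Conv2d(bands, feats, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(feats, bands, 3, padding=1))

        def forward(self, x, y, A, At):
            x = x - self.step * At(A(x) - y)   # data-fidelity gradient step
            return x + self.denoiser(x)        # learned prior (residual)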
Collapse
|
46
|
Fusion of China ZY-1 02D Hyperspectral Data and Multispectral Data: Which Methods Should Be Used? REMOTE SENSING 2021. [DOI: 10.3390/rs13122354] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
ZY-1 02D is China's first civil hyperspectral (HS) operational satellite, developed independently and successfully launched in 2019. It can collect HS data with a spatial resolution of 30 m, 166 spectral bands, a spectral range of 400~2500 nm, and a swath width of 60 km. Its competitive advantages over other on-orbit or planned satellites are its high spectral resolution and large swath width. Unfortunately, the relatively low spatial resolution may limit its applications. As a result, fusing ZY-1 02D HS data with high-spatial-resolution multispectral (MS) data is required to improve spatial resolution while maintaining spectral fidelity. This paper conducted a comprehensive evaluation study on the fusion of ZY-1 02D HS data with ZY-1 02D MS data (10-m spatial resolution), based on visual interpretation and quantitative metrics. Datasets from Hebei, China, were used in this experiment, and the performances of six common data fusion methods, namely Gram-Schmidt (GS), High Pass Filter (HPF), Nearest-Neighbor Diffusion (NND), Modified Intensity-Hue-Saturation (IHS), Wavelet Transform (Wavelet), and Color Normalized Sharpening (Brovey), were compared. The experimental results show that: (1) HPF and GS methods are better suited for the fusion of ZY-1 02D HS data and MS data, (2) IHS and Brovey methods can well improve the spatial resolution of ZY-1 02D HS data but introduce spectral distortion, and (3) Wavelet and NND results have high spectral fidelity but poor spatial detail representation. The findings of this study could serve as a good reference for the practical application of ZY-1 02D HS data fusion.
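Since HPF is singled out as well suited, here is a minimal NumPy/SciPy sketch of high-pass-filter fusion for the 30 m HS / 10 m MS case (ratio 3); the Gaussian width, interpolation order, and unweighted detail injection are illustrative simplifications:

    import numpy as np
    from scipy.ndimage import gaussian_filter, zoom

    def hpf_fuse(hs, hr_band, ratio=3):
        # hs: (h, w, B) low-resolution HS cube; hr_band: (h*ratio, w*ratio)
        # high-resolution band (in practice a matched MS band or an
        # intensity component). Upsample each HS band and inject the
        # high-frequency detail extracted from the high-resolution band.
        detail = hr_band - gaussian_filter(hr_band, sigma=ratio / 2.0)
        bands = [zoom(hs[..., b], ratio, order=1) + detail
                 for b in range(hs.shape[-1])]
        return np.stack(bands, axis=-1)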
Collapse
|
47
|
Coupled Convolutional Neural Network-Based Detail Injection Method for Hyperspectral and Multispectral Image Fusion. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app11010288] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
In this paper, a detail-injection method based on coupled convolutional neural networks (CNNs) is proposed for hyperspectral (HS) and multispectral (MS) image fusion, with the goal of enhancing the spatial resolution of HS images. Owing to the excellent spectral fidelity of the detail-injection model and the spatial-spectral feature exploration ability of CNNs, the proposed method utilizes a pair of CNNs as the feature extractors and learns details from the HS and MS images individually. By appending an additional convolutional layer, the extracted features of both images are concatenated to predict the missing details of the anticipated HS image. Experiments on simulated and real HS and MS data show that, compared with some state-of-the-art HS and MS image fusion methods, the proposed method achieves better fusion results, provides excellent spectrum-preservation ability, and is easy to implement.
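A minimal PyTorch sketch of the coupled detail-injection idea (branch depths and widths are our simplifications): two CNNs extract features from the upsampled HS and the MS image separately, and a final convolution predicts the injected details:

    import torch
    import torch.nn as nn

    class CoupledDetailInjection(nn.Module):
        # Two feature-extraction branches plus a prediction layer; the
        # predicted details are added back to the upsampled HS image.
        def __init__(self, hs_bands, ms_bands, feats=64):
            super().__init__()
            self.hs_branch = nn.Sequential(
                nn.Conv2d(hs_bands, feats, 3, padding=1), nn.ReLU(inplace=True))
            self.ms_branch = nn.Sequential(
                nn.Conv2d(ms_bands, feats, 3, padding=1), nn.ReLU(inplace=True))
            self.predict = nn.Conv2d(2 * feats, hs_bands, 3, padding=1)

        def forward(self, hs_up, ms):   # hs_up: HS upsampled to MS size
            f = torch.cat([self.hs_branch(hs_up), self.ms_branch(ms)], dim=1)
            return hs_up + self.predict(f)   # detail injection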
Collapse
|
48
|
Abstract
Multi-sensor data on the same area provide complementary information, which helps improve the discrimination capability of classifiers. In this work, a novel multilevel structure extraction method is proposed to fuse multi-sensor data. The method comprises three steps: first, multilevel structure extraction, constructed by cascading morphological profiles and structure features, is used to extract spatial information from the original images; then, a low-rank model is adopted to integrate the extracted spatial information; finally, a spectral classifier is employed to calculate class probabilities, and a maximum a posteriori estimation model decides the final labels. Experiments on three datasets covering rural and urban scenes validate that the proposed approach produces promising performance in terms of both subjective and objective qualities.
Collapse
|