1. Wu X, Cao ZH, Huang TZ, Deng LJ, Chanussot J, Vivone G. Fully-Connected Transformer for Multi-Source Image Fusion. IEEE Transactions on Pattern Analysis and Machine Intelligence 2025; 47:2071-2088. [PMID: 40031431] [DOI: 10.1109/tpami.2024.3523364]
Abstract
Multi-source image fusion combines the information from multiple images into a single output, thereby improving imaging quality, and this topic has aroused great interest in the community. How to integrate information from different sources remains a major challenge, even though existing self-attention-based transformer methods can capture spatial and channel similarities. In this paper, we first discuss the mathematical concepts behind the proposed generalized self-attention mechanism, in which existing self-attention formulations are treated as basic forms. The proposed mechanism employs multilinear algebra to drive the development of a novel fully-connected self-attention (FCSA) method that fully exploits local and non-local domain-specific correlations among multi-source images. Moreover, we propose a multi-source image representation and embed it into the FCSA framework as a non-local prior within an optimization problem. Several different fusion problems are unfolded into the proposed fully-connected transformer fusion network (FC-Former). The concept of generalized self-attention can also promote further development of self-attention; hence, the FC-Former can be viewed as a network model unifying different fusion tasks. Compared with state-of-the-art methods, the proposed FC-Former exhibits robust and superior performance, showing its capability of faithfully preserving information.
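For readers less familiar with the operation that FCSA generalizes, the sketch below shows plain scaled dot-product self-attention, the "basic form" over which spatial and channel variants are built (a minimal illustration with arbitrary tensor sizes, not the authors' implementation):

```python
import torch

def scaled_dot_product_attention(q, k, v):
    # q, k, v: (batch, tokens, dim). Tokens may index spatial positions or
    # channels depending on how feature maps are reshaped, which is what
    # spatial and channel self-attention variants differ on.
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    weights = scores.softmax(dim=-1)       # similarity of each token to all others
    return weights @ v                     # aggregate values by similarity

x = torch.randn(2, 64, 32)                 # 2 images, 64 tokens, 32-dim features
out = scaled_dot_product_attention(x, x, x)  # self-attention: q = k = v = x
print(out.shape)                             # torch.Size([2, 64, 32])
```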
2. Dian R, Liu Y, Li S. Spectral Super-Resolution via Deep Low-Rank Tensor Representation. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:5140-5150. [PMID: 38466604] [DOI: 10.1109/tnnls.2024.3359852]
Abstract
Spectral super-resolution has attracted increasing attention as a simpler and cheaper way of obtaining hyperspectral images (HSIs). Although many convolutional neural network (CNN)-based approaches have yielded impressive results, most of them ignore the low-rank prior of HSIs, resulting in huge computational and storage costs. In addition, the ability of CNN-based methods to capture correlations in global information is limited by the receptive field. To surmount these problems, we design a novel low-rank tensor reconstruction network (LTRN) for spectral super-resolution. Specifically, we treat the features of HSIs as 3-D tensors with low-rank properties due to their spectral similarity and spatial sparsity. Then, we combine canonical-polyadic (CP) decomposition with neural networks to design an adaptive low-rank prior learning (ALPL) module that enables feature learning in a 1-D space. This module has two core components: the adaptive vector learning (AVL) module and the multidimensionwise multihead self-attention (MMSA) module. The AVL module is designed to compress an HSI into a 1-D space by using a vector to represent its information. The MMSA module is introduced to improve the ability to capture long-range dependencies in the row, column, and spectral dimensions, respectively. Finally, our LTRN, mainly a cascade of several ALPL modules and feedforward networks (FFNs), achieves high-quality spectral super-resolution with fewer parameters. To evaluate our method, we conduct experiments on two datasets: the CAVE dataset and the Harvard dataset. Experimental results show that our LTRN is not only as effective as state-of-the-art methods but also has fewer parameters. The code is available at https://github.com/renweidian/LTRN.
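As background on the CP decomposition that the ALPL module builds on, the sketch below reconstructs a 3-D tensor from rank-R factor vectors along the row, column, and spectral dimensions and shows the resulting storage saving (a generic illustration with arbitrary sizes, not the LTRN code):

```python
import numpy as np

def cp_reconstruct(A, B, C):
    # A: (H, R), B: (W, R), C: (S, R) factor matrices of a rank-R CP model.
    # The reconstructed tensor is the sum over r of a_r (outer) b_r (outer) c_r.
    return np.einsum('hr,wr,sr->hws', A, B, C)

H, W, S, R = 16, 16, 31, 4                  # small illustrative sizes
A, B, C = np.random.rand(H, R), np.random.rand(W, R), np.random.rand(S, R)
X = cp_reconstruct(A, B, C)                 # (16, 16, 31) low-rank tensor
print(X.shape, "factor storage:", A.size + B.size + C.size, "vs dense:", X.size)
```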
3. Dong F, Xu Y, Shi Y, Feng Y, Ma Z, Li H, Zhang Z, Wang G, Chen Y, Xian J, Wang S, Wang S, Yi W. Spectral reconstruction from RGB image to hyperspectral image: Take the detection of glutamic acid index in beef as an example. Food Chemistry 2025; 463:141543. [PMID: 39395351] [DOI: 10.1016/j.foodchem.2024.141543]
Abstract
Using spectral reconstruction (SR) to recover full-scene hyperspectral images (HSIs) from RGB images is an important route to real-time, low-cost HSI applications. Taking the detection of the glutamic acid index in 360 beef samples as an example, the feasibility of using 11 state-of-the-art reconstruction algorithms to achieve RGB-to-HSI reconstruction in complex food systems was investigated. Multivariate correlation analysis was used to show that RGB is a projection of wide-band information comprehensively covered by the three channels. A comprehensive quality attribute (PSNR-Params-FLOPS) was proposed to determine the optimal reconstruction models (MST++, MST, MIRNet, and MPRNet). Moreover, SSIM values and t-SNE were introduced to evaluate the consistency of the reconstruction results. Finally, a lightweight Transformer was used to establish detection models on Raw-HSI, RGB, and SR-HSI data for predicting the glutamic acid index of beef. The results showed that the MST++ model exhibited the best SR performance, with RMSE, PSNR, and SSIM values of 0.015, 36.70, and 0.9253, respectively. Meanwhile, the prediction performance of the MST++ reconstruction (R2P = 0.8422 and RPD = 2.46) was close to that of Raw-HSI (R2P = 0.8526 and RPD = 2.69). The results provide practical application scenarios and detailed analysis ideas for RGB-to-HSI reconstruction.
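As a reminder of how the reconstruction metrics quoted above relate to each other, the sketch below computes PSNR directly from the RMSE between a reconstructed and a reference band and calls an off-the-shelf SSIM routine (a generic illustration assuming scikit-image is available, not the paper's evaluation code):

```python
import numpy as np
from skimage.metrics import structural_similarity as ssim

def psnr(ref, rec, data_range=1.0):
    # PSNR is a log-scaled function of the RMSE between reference and result.
    rmse = np.sqrt(np.mean((ref - rec) ** 2))
    return 20 * np.log10(data_range / rmse)

ref = np.random.rand(64, 64)                  # toy reference band in [0, 1]
rec = ref + 0.015 * np.random.randn(64, 64)   # reconstruction with ~0.015 RMSE
print(psnr(ref, rec), ssim(ref, rec, data_range=1.0))
```

With a data range of 1.0, an RMSE of 0.015 corresponds to roughly 36.5 dB, which is consistent with the values reported above.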
Affiliation(s)
- Fujia Dong
- School of Food Science and Engineering, Ningxia University, Yinchuan 750021, China; College of Mechanical and Electrical Engineering, Shihezi University, Shihezi 832003, China
| | - Ying Xu
- College of Mechanical and Electrical Engineering, Shihezi University, Shihezi 832003, China
| | - Yingkun Shi
- School of Food Science and Engineering, Ningxia University, Yinchuan 750021, China
| | - Yingjie Feng
- School of Food Science and Engineering, Ningxia University, Yinchuan 750021, China
| | - Zhaoyang Ma
- School of Food Science and Engineering, Ningxia University, Yinchuan 750021, China
| | - Hui Li
- School of Food Science and Engineering, Ningxia University, Yinchuan 750021, China
| | - Zhongxiong Zhang
- School of Food Science and Engineering, Ningxia University, Yinchuan 750021, China
| | - Guangxian Wang
- School of Food Science and Engineering, Ningxia University, Yinchuan 750021, China
| | - Yue Chen
- School of Food Science and Engineering, Ningxia University, Yinchuan 750021, China
| | - Jinhua Xian
- School of Food Science and Engineering, Ningxia University, Yinchuan 750021, China
| | - Shichang Wang
- College of Mechanical and Electrical Engineering, Shihezi University, Shihezi 832003, China
| | - Songlei Wang
- School of Food Science and Engineering, Ningxia University, Yinchuan 750021, China.
| | - Weiguo Yi
- School of Food Science and Engineering, Ningxia University, Yinchuan 750021, China.
| |
4. Li J, Du S, Song R, Li Y, Du Q. Progressive Spatial Information-Guided Deep Aggregation Convolutional Network for Hyperspectral Spectral Super-Resolution. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:1677-1691. [PMID: 37889820] [DOI: 10.1109/tnnls.2023.3325682]
Abstract
Fusion-based spectral super-resolution aims to yield a high-resolution hyperspectral image (HR-HSI) by integrating the available high-resolution multispectral image (HR-MSI) with the corresponding low-resolution hyperspectral image (LR-HSI). With the rise of deep convolutional neural networks, plentiful fusion methods have achieved breakthroughs in reconstruction performance. Nevertheless, due to inadequate and improper utilization of cross-modality information, most current state-of-the-art (SOTA) fusion-based methods cannot produce very satisfactory recovery quality and only yield the desired results at small upsampling scales, which limits their practical application. In this article, we propose a novel progressive spatial information-guided deep aggregation convolutional neural network (SIGnet) for enhancing the performance of hyperspectral image (HSI) spectral super-resolution (SSR), whose backbone consists of several dense residual channel affinity learning (DRCA) blocks cooperating with a spatial-guided propagation (SGP) module. Specifically, the DRCA block consists of an encoding part and a decoding part connected by a channel affinity propagation (CAP) module and several cross-layer skip connections. In detail, the CAP module exploits a channel affinity matrix to model correlations among channels of the feature maps, aggregating the channel-wise interdependencies of the middle layers and thereby further boosting reconstruction accuracy. Additionally, to efficiently utilize the cross-modality information of the two inputs, we develop an innovative SGP module equipped with a degradation-simulation part and a deformable adaptive fusion part, which progressively refines the coarse HSI feature maps at the pixel level. Extensive experimental results demonstrate the superiority of our proposed SIGnet over several SOTA fusion-based algorithms.
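For intuition, a channel affinity matrix of the kind exploited by the CAP module can be formed as a normalised Gram matrix between channels and then used to re-aggregate the feature maps; the sketch below is a generic illustration with arbitrary shapes, not the SIGnet implementation:

```python
import torch

def channel_affinity(feat):
    # feat: (batch, C, H, W). Pairwise channel correlations are modelled as a
    # softmax-normalised Gram matrix, then used to reweight the channels.
    b, c, h, w = feat.shape
    flat = feat.view(b, c, h * w)
    affinity = torch.softmax(flat @ flat.transpose(1, 2), dim=-1)   # (b, C, C)
    return (affinity @ flat).view(b, c, h, w)   # channels re-aggregated by affinity

x = torch.randn(1, 8, 32, 32)
print(channel_affinity(x).shape)                # torch.Size([1, 8, 32, 32])
```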
5. Zhao Y, Po LM, Lin T, Yan Q, Liu W, Xian P. HSGAN: Hyperspectral Reconstruction From RGB Images With Generative Adversarial Network. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:17137-17150. [PMID: 37561623] [DOI: 10.1109/tnnls.2023.3300099]
Abstract
Hyperspectral (HS) reconstruction from RGB images denotes the recovery of whole-scene HS information and has attracted much attention recently. State-of-the-art approaches often adopt convolutional neural networks to learn the mapping from RGB images to HS images. However, they often fail to achieve consistently high reconstruction performance across different scenes, and their performance in recovering HS images from clean versus real-world noisy RGB images is also inconsistent. To improve the accuracy and robustness of HS reconstruction across scenes and input types, we present an effective HSGAN framework with a two-stage adversarial training strategy. The generator is a four-level top-down architecture that extracts and combines features on multiple scales. To generalize well to real-world noisy images, we further propose a spatial-spectral attention block (SSAB) to learn both spatial-wise and channel-wise relations. We conduct HS reconstruction experiments on both clean and real-world noisy RGB images from five well-known HS datasets. The results demonstrate that HSGAN achieves superior performance to existing methods. Please visit https://github.com/zhaoyuzhi/HSGAN to try our code.
6. Leung JH, Karmakar R, Mukundan A, Thongsit P, Chen MM, Chang WY, Wang HC. Systematic Meta-Analysis of Computer-Aided Detection of Breast Cancer Using Hyperspectral Imaging. Bioengineering (Basel) 2024; 11:1060. [PMID: 39593720] [PMCID: PMC11591395] [DOI: 10.3390/bioengineering11111060]
Abstract
Breast cancer is the most commonly occurring cancer in the world, with more than 500,000 cases worldwide. The detection mechanism for breast cancer is endoscopist-dependent and necessitates a skilled pathologist. In recent years, many computer-aided diagnosis (CAD) systems have been used to diagnose and classify breast cancer using traditional RGB images, which analyze the images in only three color channels. Hyperspectral imaging (HSI), in contrast, is a pioneering non-destructive testing (NDT) image-processing technique that can overcome the disadvantages of traditional image processing by analyzing images across a wide spectral band. Eight studies were selected for systematic diagnostic test accuracy (DTA) analysis based on the results of the QUADAS-2 tool. Each study's technique was categorized according to the ethnicity of the data, the methodology employed, the wavelengths used, the type of cancer diagnosed, and the year of publication. A Deeks' funnel plot, forest plots, and accuracy plots were created. The results were statistically insignificant, and there was no heterogeneity among these studies. The methods and wavelength bands used with HSI technology to detect breast cancer provided high sensitivity, specificity, and accuracy. The meta-analysis of eight studies on breast cancer diagnosis using HSI methods reported average sensitivity, specificity, and accuracy of 78%, 89%, and 87%, respectively. The highest sensitivity and accuracy were achieved with SVM (95%), while CNN methods were the most commonly used but had lower sensitivity (65.43%). Statistical analyses, including meta-regression and Deeks' funnel plots, showed no heterogeneity among the studies and highlighted the evolving performance of HSI techniques, especially after 2019.
Affiliation(s)
- Joseph-Hang Leung: Department of Radiology, Ditmanson Medical Foundation Chia-Yi Christian Hospital, Chiayi City 600566, Taiwan
- Riya Karmakar: Department of Mechanical Engineering, National Chung Cheng University, 168, University Rd., Min Hsiung, Chiayi City 62102, Taiwan
- Arvind Mukundan: Department of Mechanical Engineering, National Chung Cheng University, 168, University Rd., Min Hsiung, Chiayi City 62102, Taiwan
- Pacharasak Thongsit: Faculty of Mechanical Engineering, King Mongkut’s University of Technology North Bangkok, Pracharat 1 Road, Wongsawang, Bangsue, Bangkok 10800, Thailand
- Meei-Maan Chen: Center for Innovative Research on Aging Society (CIRAS), National Chung Cheng University, 168, University Rd., Min Hsiung, Chiayi 62102, Taiwan
- Wen-Yen Chang: Department of General Surgery, Kaohsiung Armed Forces General Hospital, 2, Zhongzheng 1st Rd., Lingya District, Kaohsiung City 80284, Taiwan
- Hsiang-Chen Wang: Department of Mechanical Engineering, National Chung Cheng University, 168, University Rd., Min Hsiung, Chiayi City 62102, Taiwan; Department of Medical Research, Dalin Tzu Chi Hospital, Buddhist Tzu Chi Medical Foundation, No. 2, Minsheng Road, Dalin, Chiayi 62247, Taiwan; Hitspectra Intelligent Technology Co., Ltd., 4F., No. 2, Fuxing 4th Rd., Qianzhen Dist., Kaohsiung City 80661, Taiwan
7. Dian R, Shan T, He W, Liu H. Spectral Super-Resolution via Model-Guided Cross-Fusion Network. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:10059-10070. [PMID: 37022225] [DOI: 10.1109/tnnls.2023.3238506]
Abstract
Spectral super-resolution, which reconstructs a hyperspectral image (HSI) from a single red-green-blue (RGB) image, has attracted increasing attention. Recently, convolutional neural networks (CNNs) have achieved promising performance. However, they often fail to simultaneously exploit the imaging model of spectral super-resolution and the complex spatial and spectral characteristics of the HSI. To tackle these problems, we build a novel cross-fusion (CF)-based model-guided network (called SSRNet) for spectral super-resolution. Specifically, based on the imaging model, we unfold spectral super-resolution into an HSI prior learning (HPL) module and an imaging model guiding (IMG) module. Instead of modeling just one kind of image prior, the HPL module is composed of two subnetworks with different structures, which effectively learn the complex spatial and spectral priors of the HSI, respectively. Furthermore, a CF strategy is used to establish the connection between the two subnetworks, which further improves the learning performance of the CNN. The IMG module amounts to solving a strongly convex optimization problem, which adaptively optimizes and merges the two features learned by the HPL module by exploiting the imaging model. The two modules are alternately connected to achieve optimal HSI reconstruction performance. Experiments on both simulated and real data demonstrate that the proposed method achieves superior spectral reconstruction results with a relatively small model size. The code will be available at https://github.com/renweidian.
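The imaging model referred to here is, in its simplest form, a per-pixel linear projection of the hyperspectral spectrum onto the three RGB channels through the camera's spectral response; spectral super-resolution amounts to inverting this projection. The sketch below uses illustrative shapes and names, not the SSRNet code:

```python
import numpy as np

def rgb_from_hsi(hsi, srf):
    # hsi: (H, W, S) hyperspectral cube; srf: (S, 3) camera spectral response.
    # Each RGB pixel is a weighted integration of its spectrum over wavelength.
    return hsi @ srf

H, W, S = 32, 32, 31
hsi = np.random.rand(H, W, S)
srf = np.random.rand(S, 3)
srf /= srf.sum(axis=0, keepdims=True)       # normalise each channel's response
rgb = rgb_from_hsi(hsi, srf)                # (32, 32, 3) simulated RGB image
print(rgb.shape)
```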
8. Wang Y, Gu Y, Nanding A. SSTU: Swin-Spectral Transformer U-Net for hyperspectral whole slide image reconstruction. Computerized Medical Imaging and Graphics 2024; 114:102367. [PMID: 38522221] [DOI: 10.1016/j.compmedimag.2024.102367]
Abstract
Whole Slide Imaging and Hyperspectral Microscopic Imaging provide high-quality data with high spatial and spectral resolution for histopathology. Existing Hyperspectral Whole Slide Imaging systems combine the advantages of both techniques, providing rich information for pathological diagnosis, but they cannot avoid slow acquisition and massive data storage demands. Inspired by the spectral reconstruction task in computer vision and remote sensing, the Swin-Spectral Transformer U-Net (SSTU) has been developed to reconstruct Hyperspectral Whole Slide images (HWSis) from multiple small-field-of-view Hyperspectral Microscopic images (HMis) and Whole Slide images (WSis). The Swin-Spectral Transformer (SST) module in SSTU takes full advantage of the Transformer's capacity for extracting global attention. First, Swin Transformer attention is exploited in the spatial domain, which avoids the high computational cost of Vision Transformer structures while preserving the spatial features extracted from WSis. Furthermore, Spectral Transformer attention is exploited to collect long-range spectral features in HMis. Combined with the multi-scale encoder-bottleneck-decoder structure of U-Net, the SSTU network is formed by sequential and symmetric residual connections of SSTs, and it reconstructs a selected area of the HWSi from coarse to fine. Qualitative and quantitative experiments show that SSTU outperforms other state-of-the-art spectral reconstruction methods on the HWSi reconstruction task.
Affiliation(s)
- Yukun Wang: School of Electronic and Information Engineering, Harbin Institute of Technology, Harbin 150001, China
- Yanfeng Gu: School of Electronic and Information Engineering, Harbin Institute of Technology, Harbin 150001, China
- Abiyasi Nanding: Department of Pathology, Harbin Medical University Cancer Hospital, Harbin 150040, China
9. Huo D, Wang J, Qian Y, Yang YH. Learning to Recover Spectral Reflectance From RGB Images. IEEE Transactions on Image Processing 2024; 33:3174-3186. [PMID: 38687649] [DOI: 10.1109/tip.2024.3393390]
Abstract
This paper tackles spectral reflectance recovery (SRR) from RGB images. Since capturing ground-truth spectral reflectance and camera spectral sensitivity is challenging and costly, most existing approaches are trained on synthetic images and use the same parameters for all unseen testing images, which is suboptimal, especially when the trained models are tested on real images, because they never exploit the internal information of the testing images. To address this issue, we adopt a self-supervised meta-auxiliary learning (MAXL) strategy that fine-tunes the well-trained network parameters on each testing image to combine external with internal information. To the best of our knowledge, this is the first work that successfully adapts the MAXL strategy to this problem. Instead of relying on naive end-to-end training, we also propose a novel architecture that integrates the physical relationship between the spectral reflectance and the corresponding RGB images into the network, based on our mathematical analysis. Besides, since the spectral reflectance of a scene is independent of its illumination while the corresponding RGB images are not, we recover the spectral reflectance of a scene from its RGB images captured under multiple illuminations to further reduce the number of unknowns. Qualitative and quantitative evaluations demonstrate the effectiveness of our proposed network and of the MAXL strategy. Our code and data are available at https://github.com/Dong-Huo/SRR-MAXL.
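The physical relationship mentioned above is commonly written as a per-pixel product of reflectance and illumination integrated against the camera sensitivity; a discretised sketch follows (a generic rendering model with illustrative names, not the paper's exact formulation). Two captures of the same scene under different illuminations constrain the same reflectance, which is the idea behind using multiple illuminations to reduce the unknowns.

```python
import numpy as np

def render_rgb(reflectance, illumination, sensitivity):
    # reflectance: (H, W, S) scene reflectance, independent of the lighting.
    # illumination: (S,) spectral power distribution of the light source.
    # sensitivity: (S, 3) camera spectral sensitivity.
    radiance = reflectance * illumination       # per-pixel reflected spectrum
    return radiance @ sensitivity               # (H, W, 3) RGB observation

S = 31
refl = np.random.rand(8, 8, S)
sens = np.random.rand(S, 3)
rgb_a = render_rgb(refl, np.random.rand(S), sens)   # capture under light A
rgb_b = render_rgb(refl, np.random.rand(S), sens)   # capture under light B
print(rgb_a.shape, rgb_b.shape)
```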
10. Kior A, Yudina L, Zolin Y, Sukhov V, Sukhova E. RGB Imaging as a Tool for Remote Sensing of Characteristics of Terrestrial Plants: A Review. Plants (Basel) 2024; 13:1262. [PMID: 38732477] [PMCID: PMC11085576] [DOI: 10.3390/plants13091262]
Abstract
Remote sensing approaches can be used to estimate the influence of changing environmental conditions on terrestrial plants, enabling timely protection of their growth, development, and productivity. Different optical methods, including the informative multispectral and hyperspectral imaging of reflected light, can be used for plant remote sensing; however, multispectral and hyperspectral cameras are technically complex and expensive. RGB imaging based on the analysis of color images of plants is simpler and more accessible, but using this tool for remote sensing of plant characteristics under changing environmental conditions requires methods that increase its informativeness. Our review focuses on using RGB imaging for remote sensing of the characteristics of terrestrial plants. We consider different color models; methods for excluding the background in color images of plant canopies; various color indices and their relations to plant characteristics; the use of regression models, texture analysis, and machine learning to estimate these characteristics from color images; and some approaches for transforming simple color images into hyperspectral and multispectral images. As a whole, our review shows that RGB imaging can be an effective tool for estimating plant characteristics; however, further development of methods to analyze color images of plants is necessary.
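As one concrete example of such a color index, the widely used Excess Green index highlights vegetation pixels and can be thresholded to separate the canopy from the background; the sketch below is illustrative and not taken from the review:

```python
import numpy as np

def excess_green(rgb):
    # rgb: (H, W, 3) image with values in [0, 1].
    # ExG = 2g - r - b on chromaticity-normalised channels; high values
    # typically correspond to green vegetation.
    total = rgb.sum(axis=-1, keepdims=True) + 1e-8
    r, g, b = np.moveaxis(rgb / total, -1, 0)
    return 2 * g - r - b

img = np.random.rand(64, 64, 3)
mask = excess_green(img) > 0.05               # illustrative threshold
print(mask.mean())                            # fraction of "vegetation" pixels
```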
Affiliation(s)
- Ekaterina Sukhova: Department of Biophysics, N.I. Lobachevsky State University of Nizhny Novgorod, 603950 Nizhny Novgorod, Russia
11. Zeng S, Wang M, Jia H, Hu J, Li J. Multi-feature sparse representation based on adaptive graph constraint for cropland delineation. Optics Express 2024; 32:6463-6480. [PMID: 38439348] [DOI: 10.1364/oe.506934]
Abstract
Cropland delineation is the basis of agricultural resource surveys, and many algorithms for plot identification have been studied. However, sparse representation-based classification (SRC) has not yet been applied to cropland delineation with the high-dimensional data extracted from UAV RGB photographs. To address this problem, a new sparsity-based classification algorithm is proposed. First, a multi-feature association sparse model is designed by extracting multiple features from the UAV RGB photographs. Next, samples with similar characteristics are sought following a breadth-first principle to construct a shape-adaptive window for each test sample. Finally, the multi-feature sparse representation based on adaptive graph constraint (AMFSR) algorithm is obtained by iteratively solving the optimization objective. Experimental results show that the overall accuracy (OA) of AMFSR reaches 92.3546% and the Kappa coefficient is greater than 0.8. Furthermore, experiments demonstrate that the model also has good generalization ability.
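For orientation, sparse representation-based classification reconstructs a test sample from a dictionary of labelled training samples under a sparsity penalty and assigns it to the class with the smallest class-wise reconstruction residual. The sketch below uses an off-the-shelf lasso solver on synthetic data; it illustrates plain SRC only, not the multi-feature and adaptive graph terms that AMFSR adds:

```python
import numpy as np
from sklearn.linear_model import Lasso

def src_predict(D, labels, y, alpha=0.01):
    # D: (d, n) dictionary of training samples (columns), labels: (n,) classes,
    # y: (d,) test sample. Solve an l1-penalised least-squares y ~ D x, then
    # pick the class with the smallest class-wise reconstruction residual.
    x = Lasso(alpha=alpha, fit_intercept=False, max_iter=5000).fit(D, y).coef_
    residuals = {}
    for c in np.unique(labels):
        xc = np.where(labels == c, x, 0.0)
        residuals[c] = np.linalg.norm(y - D @ xc)
    return min(residuals, key=residuals.get)

rng = np.random.default_rng(0)
D = rng.normal(size=(20, 40))
labels = np.repeat([0, 1], 20)
y = D[:, 3] + 0.01 * rng.normal(size=20)       # near a class-0 atom
print(src_predict(D, labels, y))               # expected: 0
```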
12. Zhu L, Wu J, Biao W, Liao Y, Gu D. SpectralMAE: Spectral Masked Autoencoder for Hyperspectral Remote Sensing Image Reconstruction. Sensors (Basel) 2023; 23:3728. [PMID: 37050788] [PMCID: PMC10099040] [DOI: 10.3390/s23073728]
Abstract
Accurate hyperspectral remote sensing information is essential for feature identification and detection. Nevertheless, the hyperspectral imaging mechanism poses challenges in balancing the trade-off between spatial and spectral resolution. Hardware improvements are cost-intensive and depend on strict environmental conditions and extra equipment. Recent spectral imaging methods have attempted to reconstruct hyperspectral information directly from widely available multispectral images. However, the fixed mappings used in previous spectral reconstruction models limit their reconstruction quality and generalizability, especially when dealing with missing or contaminated bands. Moreover, data-hungry issues plague increasingly complex data-driven spectral reconstruction methods. This paper proposes SpectralMAE, a novel spectral reconstruction model that can take arbitrary combinations of bands as input and improves the utilization of data sources. In contrast to previous spectral reconstruction techniques, SpectralMAE explores a self-supervised learning paradigm and proposes a masked autoencoder architecture for the spectral dimension. To further enhance performance for specific sensor inputs, we propose a training strategy that combines random-masking pre-training with fixed-masking fine-tuning. Empirical evaluations on five remote sensing datasets demonstrate that SpectralMAE outperforms state-of-the-art methods in both qualitative and quantitative metrics.
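The core masking idea can be illustrated in a few lines: during pre-training, a random subset of spectral bands is hidden for each pixel and the autoencoder must reproduce the full spectrum from the visible bands. The sketch below is a toy illustration of that masking step, not the SpectralMAE architecture:

```python
import torch

def random_band_mask(spectra, mask_ratio=0.75):
    # spectra: (batch, bands). Randomly hide a fraction of the spectral bands;
    # the model only sees the surviving bands and must reconstruct the rest.
    batch, bands = spectra.shape
    keep = max(1, int(bands * (1 - mask_ratio)))
    idx = torch.rand(batch, bands).argsort(dim=1)[:, :keep]   # kept band indices
    mask = torch.zeros(batch, bands, dtype=torch.bool)
    mask.scatter_(1, idx, torch.ones_like(idx, dtype=torch.bool))
    return spectra * mask, mask        # masked input and visibility mask

x = torch.rand(4, 31)                  # 4 pixels, 31 spectral bands
masked, mask = random_band_mask(x)
print(mask.sum(dim=1))                 # number of visible bands per pixel
```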
Affiliation(s)
- Lingxuan Zhu: School of Electronic Engineering, Xidian University, Xi’an 710071, China; National Key Laboratory of Scattering and Radiation, Shanghai 200438, China
- Jiaji Wu: School of Electronic Engineering, Xidian University, Xi’an 710071, China
- Wang Biao: School of Electronic Engineering, Xidian University, Xi’an 710071, China
- Yi Liao: National Key Laboratory of Scattering and Radiation, Shanghai 200438, China
- Dandan Gu: National Key Laboratory of Scattering and Radiation, Shanghai 200438, China
13. Yao P, Wu H, Xin JH. Improving Generalizability of Spectral Reflectance Reconstruction Using L1-Norm Penalization. Sensors (Basel) 2023; 23:689. [PMID: 36679486] [PMCID: PMC9861650] [DOI: 10.3390/s23020689]
Abstract
Spectral reflectance reconstruction for multispectral images (such as Wiener estimation) may perform sub-optimally when the object being measured has a texture that is not in the training set, and reconstruction accuracy is significantly lower without matching training samples. We propose an improved reflectance reconstruction method based on L1-norm penalization to solve this issue. Using the L1-norm, our method yields a transformation matrix with a favorable sparsity property, which helps to achieve better results when measuring unseen samples. We verify the proposed method by reconstructing spectral reflectance for four types of materials (cotton, paper, polyester, and nylon) captured by a multispectral imaging system. Each material has its own texture, and there are 204 samples per material/texture in the experiments. The experimental results show that when the texture is not included in the training dataset, the L1-norm approach achieves better results than existing methods in terms of a colorimetric measure (i.e., color difference) and shows consistent accuracy across the four kinds of materials.
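A minimal sketch of the idea follows: an L1-penalised regression from camera responses to reflectance yields a sparse transformation matrix, in contrast to the dense least-squares (Wiener-style) solution. The shapes, the synthetic data, and the use of scikit-learn's lasso are illustrative assumptions, not the authors' exact formulation:

```python
import numpy as np
from sklearn.linear_model import Lasso

# Training pairs: camera responses (n, c channels) -> reflectances (n, s bands).
rng = np.random.default_rng(1)
n, c, s = 200, 6, 31
R = rng.random((n, s))                        # ground-truth reflectance spectra
M = rng.random((s, c))                        # multispectral system response
X = R @ M + 0.01 * rng.normal(size=(n, c))    # simulated camera responses

# L1-penalised regression gives a sparse transformation matrix W (c x s),
# intended to generalise better to textures unseen during training.
W = Lasso(alpha=1e-4, fit_intercept=False, max_iter=10000).fit(X, R).coef_.T
recovered = X @ W                             # reconstructed reflectance (n, s)
print(W.shape, np.abs(W).mean())
```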
Affiliation(s)
- Pengpeng Yao: Zhuhai Fudan Innovation Institute, Zhuhai 519000, China; School of Fashion and Textile, The Hong Kong Polytechnic University, Hong Kong, China
- Hochung Wu: School of Fashion and Textile, The Hong Kong Polytechnic University, Hong Kong, China
- John H. Xin: School of Fashion and Textile, The Hong Kong Polytechnic University, Hong Kong, China
14. Fu J, Liu J, Zhao R, Chen Z, Qiao Y, Li D. Maize disease detection based on spectral recovery from RGB images. Frontiers in Plant Science 2022; 13:1056842. [PMID: 36618618] [PMCID: PMC9811593] [DOI: 10.3389/fpls.2022.1056842]
Abstract
Maize is susceptible to pest and disease infection, and early disease detection is key to preventing yield losses. The raw data used for plant disease detection are commonly RGB images and hyperspectral images (HSIs). RGB images can be acquired rapidly and at low cost, but the resulting detection accuracy is not satisfactory. In contrast, HSIs tend to yield higher detection accuracy but are difficult and costly to acquire in the field. To overcome this contradiction, we propose a maize spectral-recovery disease detection framework with two parts: a maize spectral recovery network based on the advanced hyperspectral recovery convolutional neural network (HSCNN+) and a maize disease detection network based on a convolutional neural network (CNN). Raw RGB data are taken as input to the framework, and the reconstructed HSIs are then fed to the disease detection network to perform the detection task. As a result, the detection accuracy obtained from the low-cost raw RGB data is almost the same as that obtained by using HSIs directly. HSCNN+ proves well suited to our spectral recovery task, and the reconstruction fidelity is satisfactory. Experimental results demonstrate that the reconstructed HSIs efficiently improve detection accuracy compared with raw RGB images in the tested scenarios, especially in the complex-environment scenario, where the detection accuracy increases by 6.14%. The proposed framework has the advantages of speed, low cost, and high detection precision. Moreover, it offers the possibility of real-time, precise field disease detection and can be applied to agricultural robots.
Affiliation(s)
- Jun Fu: College of Biological and Agricultural Engineering, Jilin University, Changchun, China; Key Laboratory of Efficient Sowing and Harvesting Equipment, Ministry of Agriculture and Rural Affairs, Jilin University, Changchun, China; Key Laboratory of Bionic Engineering, Ministry of Education, Jilin University, Changchun, China
- Jindai Liu: College of Biological and Agricultural Engineering, Jilin University, Changchun, China; Key Laboratory of Efficient Sowing and Harvesting Equipment, Ministry of Agriculture and Rural Affairs, Jilin University, Changchun, China
- Rongqiang Zhao: College of Biological and Agricultural Engineering, Jilin University, Changchun, China; Key Laboratory of Efficient Sowing and Harvesting Equipment, Ministry of Agriculture and Rural Affairs, Jilin University, Changchun, China
- Zhi Chen: College of Biological and Agricultural Engineering, Jilin University, Changchun, China; Department of Science and Technology Development, Chinese Academy of Agricultural Mechanization Sciences, Beijing, China
- Yongliang Qiao: Australian Centre for Field Robotics (ACFR), Faculty of Engineering, The University of Sydney, Sydney, NSW, Australia
- Dan Li: College of Astronautics, Nanjing University of Aeronautics and Astronautics, Nanjing, China
15. He J, Li J, Yuan Q, Shen H, Zhang L. Spectral Response Function-Guided Deep Optimization-Driven Network for Spectral Super-Resolution. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:4213-4227. [PMID: 33600324] [DOI: 10.1109/tnnls.2021.3056181]
Abstract
Hyperspectral images (HSIs) are crucial for many research areas. Spectral super-resolution (SSR) is a method used to obtain high-spatial-resolution (HR) HSIs from HR multispectral images. Traditional SSR methods include model-driven algorithms and deep learning. By unfolding a variational method, this article proposes an optimization-driven convolutional neural network (CNN) with a deep spatial-spectral prior, resulting in a physically interpretable network. Unlike fully data-driven CNNs, an auxiliary spectral response function (SRF) is utilized to guide the network to group bands with spectral relevance. In addition, a channel attention module (CAM) and a reformulated spectral angle mapper loss function are applied to achieve an effective reconstruction model. Finally, experiments on two types of data sets, including natural and remote sensing images, demonstrate the spectral enhancement effect of the proposed method, and classification results on the remote sensing data set verify the validity of the information enhanced by the proposed method.
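The spectral angle mapper (SAM) that the loss function reformulates measures the angle between a reconstructed and a reference spectrum, making it insensitive to overall intensity scaling; a minimal sketch is given below (illustrative only, not the paper's exact loss):

```python
import torch

def sam_loss(pred, target, eps=1e-8):
    # pred, target: (batch, bands, H, W). Angle between spectra at each pixel;
    # the mean angle (in radians) serves as a training loss.
    dot = (pred * target).sum(dim=1)
    norms = pred.norm(dim=1) * target.norm(dim=1) + eps
    angle = torch.acos((dot / norms).clamp(-1 + eps, 1 - eps))
    return angle.mean()

pred = torch.rand(2, 31, 8, 8, requires_grad=True)
target = torch.rand(2, 31, 8, 8)
loss = sam_loss(pred, target)
loss.backward()                      # differentiable, so usable as a loss term
print(loss.item())
```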
16. Xiong F, Zhou J, Tao S, Lu J, Zhou J, Qian Y. SMDS-Net: Model Guided Spectral-Spatial Network for Hyperspectral Image Denoising. IEEE Transactions on Image Processing 2022; 31:5469-5483. [PMID: 35951563] [DOI: 10.1109/tip.2022.3196826]
Abstract
Deep learning (DL)-based hyperspectral image (HSI) denoising approaches directly learn the nonlinear mapping between noisy and clean HSI pairs, but they usually do not consider the physical characteristics of HSIs. This drawback deprives the models of the interpretability that is key to understanding their denoising mechanism and limits their denoising ability. In this paper, we introduce a novel model-guided interpretable network for HSI denoising to tackle this problem. Fully considering the spatial redundancy, spectral low-rankness, and spectral-spatial correlations of HSIs, we first establish a subspace-based multidimensional sparse (SMDS) model under the umbrella of tensor notation. After that, the model is unfolded into an end-to-end network named SMDS-Net, whose fundamental modules are seamlessly connected with the denoising procedure and optimization of the SMDS model. This gives SMDS-Net clear physical meaning, i.e., it learns the low-rankness and sparsity of HSIs. Finally, all key variables are obtained by discriminative training. Extensive experiments and comprehensive analysis on synthetic and real-world HSIs confirm the strong denoising ability, strong learning capability, promising generalization ability, and high interpretability of SMDS-Net compared with state-of-the-art HSI denoising methods. The source code and data of this article will be made publicly available at https://github.com/bearshng/smds-net for reproducible research.
17. Zhang J, Su R, Fu Q, Ren W, Heide F, Nie Y. A survey on computational spectral reconstruction methods from RGB to hyperspectral imaging. Scientific Reports 2022; 12:11905. [PMID: 35831474] [PMCID: PMC9279412] [DOI: 10.1038/s41598-022-16223-1]
Abstract
Hyperspectral imaging enables many versatile applications thanks to its competence in capturing abundant spatial and spectral information, which is crucial for identifying substances. However, the devices for acquiring hyperspectral images are typically expensive and very complicated, hindering their adoption in consumer applications such as daily food inspection and point-of-care medical screening. Recently, many computational spectral imaging methods have been proposed that directly reconstruct the hyperspectral information from widely available RGB images. These reconstruction methods can avoid the use of burdensome spectral camera hardware while keeping high spectral resolution and imaging performance. We present a thorough investigation of more than 25 state-of-the-art spectral reconstruction methods, categorized as prior-based and data-driven methods. Simulations on open-source datasets show that prior-based methods are more suitable for data-scarce situations, while data-driven methods can unleash the full potential of deep learning in big-data cases. We identify the current challenges faced by these methods (e.g., loss function, spectral accuracy, data generalization) and summarize a few trends for future work. With the rapid expansion of datasets and the advent of more advanced neural networks, learnable methods with fine feature representation abilities are very promising. This comprehensive review can serve as a fruitful reference source for peer researchers, thus paving the way for the development of computational hyperspectral imaging.
Affiliation(s)
- Jingang Zhang: School of Future Technology, University of Chinese Academy of Sciences, Beijing 100039, China
- Runmu Su: School of Future Technology, University of Chinese Academy of Sciences, Beijing 100039, China; School of Computer Science and Technology, Xidian University, Xi'an 710071, China
- Qiang Fu: King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia
- Wenqi Ren: State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China
- Felix Heide: Computational Imaging Lab, Princeton University, Princeton, NJ 08544, USA
- Yunfeng Nie: Department of Applied Physics and Photonics, Vrije Universiteit Brussel, 1050 Brussels, Belgium
18. Xu L, Zhou B, Li X, Wu Z, Chen Y, Wang X, Tang Y. Gaussian process image classification based on multi-layer convolution kernel function. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.01.048]
19. Fu Y, Zhang T, Zheng Y, Zhang D, Huang H. Joint Camera Spectral Response Selection and Hyperspectral Image Recovery. IEEE Transactions on Pattern Analysis and Machine Intelligence 2022; 44:256-272. [PMID: 32750820] [DOI: 10.1109/tpami.2020.3009999]
Abstract
Hyperspectral image (HSI) recovery from a single RGB image has attracted much attention, and its performance has recently been shown to be sensitive to the camera spectral response (CSR). In this paper, we present an efficient convolutional neural network (CNN)-based method that can jointly select the optimal CSR from a candidate dataset and learn a mapping to recover the HSI from a single RGB image captured with this algorithmically selected camera, under both multi-chip and single-chip setups. Given a specific CSR, we first present an HSI recovery network that accounts for the underlying characteristics of the HSI, including the spectral nonlinear mapping and spatial similarity. We then append a CSR selection layer onto the recovery network, so that the optimal CSR under both multi-chip and single-chip setups can be automatically determined from the network weights under a nonnegative sparse constraint. Experimental results on three hyperspectral datasets and two camera spectral response datasets demonstrate that our HSI recovery network outperforms state-of-the-art methods in terms of both quantitative metrics and perceptual quality, and the selection layer always returns a CSR consistent with the best one determined by exhaustive search. Finally, we show that our method also performs well in a real capture system, and we collect a hyperspectral flower dataset to evaluate the effect of HSI recovery on a classification problem.
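One way to picture a CSR selection layer is as a trainable weighting over a bank of candidate responses, with the constraint encouraging the weight to concentrate on a single candidate during training. The sketch below uses a softmax weighting purely for illustration; the names, shapes, and the softmax choice are assumptions, not the paper's implementation, which uses a nonnegative sparse constraint:

```python
import torch

n_candidates, bands = 5, 31
csr_bank = torch.rand(n_candidates, bands, 3)       # candidate camera responses
logits = torch.zeros(n_candidates, requires_grad=True)

def project_rgb(hsi):
    # hsi: (pixels, bands). Softmax keeps the selection weights nonnegative and
    # summing to one; an added sparsity penalty would push them toward one-hot.
    weights = torch.softmax(logits, dim=0)
    effective_csr = (weights[:, None, None] * csr_bank).sum(dim=0)  # (bands, 3)
    return hsi @ effective_csr

rgb = project_rgb(torch.rand(10, bands))
rgb.sum().backward()                  # gradients flow back to the selection weights
print(torch.softmax(logits, dim=0))
```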
21. Hyperspectral Image Super-Resolution with Self-Supervised Spectral-Spatial Residual Network. Remote Sensing 2021. [DOI: 10.3390/rs13071260]
Abstract
Recently, many convolutional networks have been built to fuse a low spatial resolution (LR) hyperspectral image (HSI) and a high spatial resolution (HR) multispectral image (MSI) to obtain HR HSIs. However, most deep learning-based methods are supervised and require sufficient HR HSIs for training, and collecting plenty of HR HSIs is laborious and time-consuming. In this paper, a self-supervised spectral-spatial residual network (SSRN) is proposed to alleviate the dependence on large numbers of HR HSIs. In SSRN, the fusion of HR MSIs and LR HSIs is treated as a pixel-wise spectral mapping problem. Firstly, this paper assumes that the spectral mapping between HR MSIs and HR HSIs can be approximated by the spectral mapping between LR MSIs (derived from HR MSIs) and LR HSIs. Secondly, the spectral mapping between LR MSIs and LR HSIs is learned by SSRN. Finally, a self-supervised fine-tuning strategy is proposed to transfer the learned spectral mapping to generate HR HSIs. SSRN does not require HR HSIs as supervision during training. Simulated and real hyperspectral databases are utilized to verify the performance of SSRN.
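That assumption can be pictured with a tiny per-pixel model: a spectral mapping from MSI bands to HSI bands is trained on co-registered LR pairs and then applied pixel-by-pixel to the HR MSI. The sketch below is a toy illustration of this transfer; layer sizes and training details are arbitrary assumptions, not the SSRN architecture:

```python
import torch
import torch.nn as nn

bands_msi, bands_hsi = 4, 31
mapper = nn.Sequential(                       # per-pixel spectral mapping
    nn.Linear(bands_msi, 64), nn.ReLU(), nn.Linear(64, bands_hsi))

lr_msi = torch.rand(1000, bands_msi)          # LR MSI pixels (from the HR MSI)
lr_hsi = torch.rand(1000, bands_hsi)          # co-registered LR HSI pixels
opt = torch.optim.Adam(mapper.parameters(), lr=1e-3)
for _ in range(200):                          # self-supervised training at LR
    opt.zero_grad()
    loss = nn.functional.mse_loss(mapper(lr_msi), lr_hsi)
    loss.backward()
    opt.step()

hr_msi = torch.rand(512 * 512, bands_msi)     # HR MSI pixels
hr_hsi = mapper(hr_msi)                       # transferred mapping -> HR HSI
print(hr_hsi.shape)                           # torch.Size([262144, 31])
```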
22. Double Ghost Convolution Attention Mechanism Network: A Framework for Hyperspectral Reconstruction of a Single RGB Image. Sensors (Basel) 2021; 21:666. [PMID: 33477959] [PMCID: PMC7835855] [DOI: 10.3390/s21020666]
Abstract
Current research on reconstructing hyperspectral images from RGB images using deep learning mainly focuses on learning complex mappings through deeper and wider convolutional neural networks (CNNs). However, the reconstruction accuracy of the hyperspectral image is not high and, among other issues, the model for generating these images takes up too much storage space. In this study, we propose the double ghost convolution attention mechanism network (DGCAMN) framework for hyperspectral reconstruction from a single RGB image, to improve the accuracy of spectral reconstruction and reduce the storage occupied by the model. The proposed DGCAMN consists of a double ghost residual attention block (DGRAB) module and an optimal nonlocal block (ONB). The DGRAB module uses GhostNet and PReLU activation functions to reduce the computational parameters and the storage size of the generative model. At the same time, the proposed double output feature Convolutional Block Attention Module (DOFCBAM) is used to capture texture details in the feature map and maximize the content of the reconstructed hyperspectral image. In the proposed ONB, an Argmax activation function is used to obtain the region with the most abundant feature information and maximize the most useful feature parameters. This helps to improve the accuracy of spectral reconstruction. These contributions enable the DGCAMN framework to achieve the highest spectral accuracy with minimal storage consumption. The proposed method has been applied to the NTIRE 2020 dataset. Experimental results show that the proposed DGCAMN outperforms advanced deep learning methods in spectral reconstruction accuracy and greatly reduces storage consumption.
23. Marques Junior A, de Souza EM, Müller M, Brum D, Zanotta DC, Horota RK, Kupssinskü LS, Veronez MR, Gonzaga L, Cazarin CL. Improving Spatial Resolution of Multispectral Rock Outcrop Images Using RGB Data and Artificial Neural Networks. Sensors (Basel) 2020; 20:3559. [PMID: 32586025] [PMCID: PMC7349106] [DOI: 10.3390/s20123559]
Abstract
Spectral information provided by multispectral and hyperspectral sensors has a great impact on remote sensing studies, easing the identification of carbonate outcrops that contribute to a better understanding of petroleum reservoirs. Sensors aboard satellites such as the Landsat series, whose data are freely available, usually lack the spatial resolution that suborbital sensors have. Many techniques have been developed to improve spatial resolution through data fusion, but most of them have serious limitations regarding application and scale. Recently, super-resolution (SR) convolutional neural networks have been tested with encouraging results; however, they require large datasets and more time and computational power for training. To overcome these limitations, this work aims to increase the spatial resolution of multispectral bands from the Landsat satellite database using a modified artificial neural network that takes pixel kernels of a single high-spatial-resolution RGB image from Google Earth as input. The methodology was validated with a common dataset of indoor images as well as a specific area of Landsat 8. Inputs downscaled to different sizes were used for training, with validation performed against the ground truth of the original-size images, obtaining results comparable to recent works. With the method validated, we generated high-spatial-resolution spectral bands from Google Earth RGB images over a carbonate outcrop area, which were then classified according to the soil spectral responses, taking advantage of the higher-spatial-resolution dataset.
Affiliation(s)
- Ademir Marques Junior: Vizlab|X-Reality and Geoinformatics Lab, Graduate Programme in Applied Computing, Unisinos University, São Leopoldo RS 93022-750, Brazil
- Marianne Müller: Vizlab|X-Reality and Geoinformatics Lab, Graduate Programme in Applied Computing, Unisinos University, São Leopoldo RS 93022-750, Brazil
- Diego Brum: Vizlab|X-Reality and Geoinformatics Lab, Graduate Programme in Applied Computing, Unisinos University, São Leopoldo RS 93022-750, Brazil
- Daniel Capella Zanotta: Vizlab|X-Reality and Geoinformatics Lab, Graduate Programme in Applied Computing, Unisinos University, São Leopoldo RS 93022-750, Brazil
- Rafael Kenji Horota: Vizlab|X-Reality and Geoinformatics Lab, Graduate Programme in Applied Computing, Unisinos University, São Leopoldo RS 93022-750, Brazil
- Lucas Silveira Kupssinskü: Vizlab|X-Reality and Geoinformatics Lab, Graduate Programme in Applied Computing, Unisinos University, São Leopoldo RS 93022-750, Brazil
- Maurício Roberto Veronez: Vizlab|X-Reality and Geoinformatics Lab, Graduate Programme in Applied Computing, Unisinos University, São Leopoldo RS 93022-750, Brazil
- Luiz Gonzaga: Vizlab|X-Reality and Geoinformatics Lab, Graduate Programme in Applied Computing, Unisinos University, São Leopoldo RS 93022-750, Brazil
- Caroline Lessio Cazarin: CENPES-PETROBRAS - Centro de Pesquisas Leopoldo Américo Miguez de Mello, Rio de Janeiro RJ 21941-598, Brazil