1. Sun Y, Lei L, Guan D, Kuang G, Li Z, Liu L. Locality Preservation for Unsupervised Multimodal Change Detection in Remote Sensing Imagery. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:6955-6969. [PMID: 38809739] [DOI: 10.1109/tnnls.2024.3401696]
Abstract
Multimodal change detection (MCD) is a topic of increasing interest in remote sensing. Due to different imaging mechanisms, multimodal images cannot be directly compared to detect changes. In this article, we explore the topological structure of multimodal images and construct links between the class relationships (same/different) and change labels (changed/unchanged) of pairwise superpixels, which are imaging-modality-invariant. With these links, we formulate the MCD problem within a mathematical framework termed the locality-preserving energy model (LPEM), which is used to maintain the local consistency constraints embedded in the links: structure consistency based on feature similarity and label consistency based on spatial continuity. Because the foundation of LPEM, i.e., the links, is intuitively explainable and universal, the proposed method is very robust across different MCD situations. Notably, LPEM is built directly on the label of each superpixel, so it is a paradigm that outputs the change map (CM) directly, without generating an intermediate difference image (DI) as most previous algorithms do. Experiments on different real datasets demonstrate the effectiveness of the proposed method. Source code of the proposed method is made available at https://github.com/yulisun/LPEM.
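As a rough illustration of the energy-minimization idea, the toy sketch below labels superpixels with iterated conditional modes, trading a unary structure-disagreement term against a spatial label-smoothness term. The scoring, weights, and solver are illustrative assumptions, not the paper's exact LPEM formulation.

```python
import numpy as np

def icm_change_labels(D, W, beta=1.0, n_iter=10):
    """D: (N,) structure-disagreement scores scaled to [0, 1] (high = likely changed).
    W: (N, N) spatial adjacency weights between superpixels.
    Returns binary labels (1 = changed) minimizing a unary + pairwise energy."""
    labels = (D > D.mean()).astype(int)            # initialize from the data term
    for _ in range(n_iter):                        # iterated conditional modes
        for i in range(len(labels)):
            # unary cost: labeling i unchanged costs D[i], changed costs 1 - D[i]
            cost = np.array([D[i], 1.0 - D[i]])
            for l in (0, 1):                       # pairwise label-consistency cost
                cost[l] += beta * np.sum(W[i] * (labels != l))
            labels[i] = int(np.argmin(cost))
    return labels
```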
2. Liu T, Zhang M, Gong M, Zhang Q, Jiang F, Zheng H, Lu D. Commonality Feature Representation Learning for Unsupervised Multimodal Change Detection. IEEE Transactions on Image Processing 2025; 34:1219-1233. [PMID: 40031527] [DOI: 10.1109/tip.2025.3539461]
Abstract
The main challenge of multimodal change detection (MCD) is that multimodal bitemporal images (MBIs) cannot be compared directly to identify changes. To overcome this problem, this paper proposes a novel commonality feature representation learning (CFRL) scheme and constructs a CFRL-based unsupervised MCD framework. CFRL is composed of a Siamese-based encoder and two decoders. First, the Siamese-based encoder maps the original MBIs into the same feature space, extracting representative features for each modality. Then, the two decoders reconstruct the original MBIs, each decoder regressing its own modality. Meanwhile, the decoders are swapped to reconstruct pseudo-MBIs and thereby perform modality alignment. Subsequently, all reconstructed images are fed to the Siamese-based encoder again to map them into the same feature space, yielding their representative features. On this basis, latent commonality features between MBIs can be extracted by minimizing the distance between these representative features. These latent commonality features are comparable and can be used to identify changes. Notably, the proposed CFRL can be performed simultaneously for the two modalities of the MBIs. Therefore, two change magnitude images (CMIs) can be generated simultaneously by measuring the difference between the commonality features of the MBIs. Finally, a simple thresholding or clustering algorithm can be employed to turn the CMIs into binary change maps. Extensive experiments on six publicly available MCD datasets show that the proposed CFRL-based framework achieves superior performance compared with other state-of-the-art approaches.
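A minimal PyTorch sketch of the shared-encoder, swapped-decoder idea is given below; the layer sizes and the loss combination are illustrative assumptions rather than the paper's actual architecture.

```python
import torch
import torch.nn as nn

class CFRLSketch(nn.Module):
    def __init__(self, c1, c2, feat=32):
        super().__init__()
        # shared (Siamese) encoder; 1x1 stems absorb differing channel counts
        self.stem1 = nn.Conv2d(c1, feat, 1)
        self.stem2 = nn.Conv2d(c2, feat, 1)
        self.encoder = nn.Sequential(nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(),
                                     nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU())
        self.dec1 = nn.Conv2d(feat, c1, 3, padding=1)   # decoder for modality 1
        self.dec2 = nn.Conv2d(feat, c2, 3, padding=1)   # decoder for modality 2

    def forward(self, x1, x2):
        f1 = self.encoder(self.stem1(x1))
        f2 = self.encoder(self.stem2(x2))
        rec1, rec2 = self.dec1(f1), self.dec2(f2)       # self-reconstruction
        pse2, pse1 = self.dec2(f1), self.dec1(f2)       # swapped decoders: pseudo-MBIs
        return f1, f2, rec1, rec2, pse1, pse2

model = CFRLSketch(c1=3, c2=1)                          # e.g. optical vs. SAR
x1, x2 = torch.rand(1, 3, 64, 64), torch.rand(1, 1, 64, 64)
f1, f2, rec1, rec2, pse1, pse2 = model(x1, x2)
loss = (nn.functional.mse_loss(rec1, x1) + nn.functional.mse_loss(rec2, x2)
        + nn.functional.mse_loss(f1, f2))   # commonality via feature proximity
```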
3. Sun Y, Lei L, Guan D, Wu J, Kuang G, Liu L. Image Regression With Structure Cycle Consistency for Heterogeneous Change Detection. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:1613-1627. [PMID: 35767492] [DOI: 10.1109/tnnls.2022.3184414]
Abstract
Change detection (CD) between heterogeneous images is an increasingly interesting topic in remote sensing. The different imaging mechanisms lead to the failure of homogeneous CD methods on heterogeneous images. To address this challenge, we propose a structure cycle consistency-based image regression method, which consists of two components: the exploration of the structure representation and structure-based regression. We first construct a similarity-relationship-based graph to capture the structure information of each image; here, a k-selection strategy and an adaptive-weighted distance metric are employed to connect each node with its truly similar neighbors. Then, we conduct structure-based regression with this adaptively learned graph. More specifically, we transform one image to the domain of the other image via structure cycle consistency, which yields three types of constraints: a forward transformation term, a cycle transformation term, and a sparse regularization term. Notably, this is not traditional pixel-value-based image regression but image structure regression, i.e., it requires the transformed image to have the same structure as the original image. Finally, changes can be extracted accurately by directly comparing the transformed and original images. Experiments conducted on different real datasets show the excellent performance of the proposed method. The source code of the proposed method will be made available at https://github.com/yulisun/AGSCC.
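The sketch below illustrates one way to build a similarity-relationship graph with per-node neighbor selection; the truncation rule (keep neighbors closer than the mean kNN distance) and the exponential weighting stand in for the paper's k-selection strategy and adaptive-weighted metric, and are assumptions.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def adaptive_graph(feats, k_max=20):
    """feats: (N, d) per-superpixel features. Returns {node: [(neighbor, weight)]}."""
    nn = NearestNeighbors(n_neighbors=k_max + 1).fit(feats)
    dist, idx = nn.kneighbors(feats)              # position 0 is the node itself
    graph = {}
    for i in range(len(feats)):
        d, j = dist[i, 1:], idx[i, 1:]
        keep = d <= d.mean()                      # adaptive truncation of weak links
        w = np.exp(-d[keep] / (d[keep].mean() + 1e-12))   # distance-adaptive weights
        graph[i] = list(zip(j[keep].tolist(), w.tolist()))
    return graph
```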
4. Lin M, Yang G, Zhang H. Transition Is a Process: Pair-to-Video Change Detection Networks for Very High Resolution Remote Sensing Images. IEEE Transactions on Image Processing 2022; 32:57-71. [PMID: 37015527] [DOI: 10.1109/tip.2022.3226418]
Abstract
As an important yet challenging task in Earth observation, change detection (CD) is undergoing a technological revolution, given the broadening application of deep learning. Nevertheless, existing deep learning-based CD methods still suffer from two salient issues: 1) incomplete temporal modeling and 2) space-time coupling. In view of these issues, we propose a more explicit and sophisticated modeling of time and accordingly establish a pair-to-video change detection (P2V-CD) framework. First, a pseudo transition video that carries rich temporal information is constructed from the input image pair, interpreting CD as a problem of video understanding. Then, two decoupled encoders are utilized to spatially and temporally recognize the type of transition, and the encoders are laterally connected for mutual promotion. Furthermore, deep supervision is applied to accelerate model training. We show experimentally that P2V-CD compares favorably with other state-of-the-art CD approaches in terms of both visual quality and evaluation metrics, with a moderate model size and relatively low computational overhead. Extensive feature map visualization experiments demonstrate how our method works beyond simply contrasting bi-temporal images. Source code is available at https://github.com/Bobholamovic/CDLab.
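One simple way to realize a pseudo transition video is linear interpolation between the two dates, as sketched below; the paper's actual frame-construction scheme may differ, so treat this as an assumption-laden illustration of the pair-to-video idea.

```python
import numpy as np

def pseudo_video(img_t1, img_t2, n_frames=8):
    """Stack n_frames frames interpolated between the two dates: (T, H, W, C)."""
    alphas = np.linspace(0.0, 1.0, n_frames)
    return np.stack([(1 - a) * img_t1 + a * img_t2 for a in alphas])

frames = pseudo_video(np.zeros((64, 64, 3)), np.ones((64, 64, 3)))
assert frames.shape == (8, 64, 64, 3)   # a short "video" of the transition
```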
5. Yang M, Jiao L, Liu F, Hou B, Yang S, Jian M. DPFL-Nets: Deep Pyramid Feature Learning Networks for Multiscale Change Detection. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:6402-6416. [PMID: 34029198] [DOI: 10.1109/tnnls.2021.3079627]
Abstract
Due to the complementary properties of different types of sensors, change detection between heterogeneous images is receiving increasing attention from researchers. However, change detection cannot be handled by directly comparing two heterogeneous images, since they exhibit different image appearances and statistics. In this article, we propose a deep pyramid feature learning network (DPFL-Net) for change detection, especially between heterogeneous images. DPFL-Net can learn a series of hierarchical features in an unsupervised fashion, containing both spatial details and multiscale contextual information. After being transformed into the same space at each scale successively, the learned pyramid features from the two input images make unchanged pixels match exactly while changed ones remain dissimilar. We further propose fusion blocks to aggregate multiscale difference images (DIs), generating an enhanced DI with strong separability. Based on the enhanced DI, unchanged areas are predicted and used to train DPFL-Net in the next iteration. Pyramid features and unchanged areas are updated alternately, leading to an unsupervised change detection method. In the feature transformation process, local consistency is introduced to constrain the learned pyramid features, modeling the correlations between neighboring pixels and reducing false alarms. Experimental results demonstrate that the proposed approach achieves superior or at least comparable results to existing state-of-the-art change detection methods in both homogeneous and heterogeneous cases.
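The sketch below shows multiscale DI fusion in its simplest form: per-scale feature differences are normalized, upsampled, and combined with weights. The fusion blocks in the paper are learned; the fixed weights here are placeholder assumptions.

```python
import numpy as np
from scipy.ndimage import zoom

def fuse_difference_images(feats_a, feats_b, weights=None):
    """feats_a, feats_b: lists of (H_s, W_s) feature maps at several scales,
    finest scale first. Returns an enhanced difference image at full size."""
    H, W = feats_a[0].shape
    weights = weights if weights is not None else [1.0 / len(feats_a)] * len(feats_a)
    di = np.zeros((H, W))
    for fa, fb, w in zip(feats_a, feats_b, weights):
        d = np.abs(fa - fb)                                     # per-scale DI
        d = zoom(d, (H / d.shape[0], W / d.shape[1]), order=1)  # upsample to full size
        di += w * (d - d.min()) / (np.ptp(d) + 1e-12)           # normalize, then weight
    return di
```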
6. Wu Y, Li J, Yuan Y, Qin AK, Miao QG, Gong MG. Commonality Autoencoder: Learning Common Features for Change Detection From Heterogeneous Images. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:4257-4270. [PMID: 33600325] [DOI: 10.1109/tnnls.2021.3056238]
Abstract
Change detection based on heterogeneous images, such as optical and synthetic aperture radar images, is a challenging problem because of their huge appearance differences. To combat this problem, we propose an unsupervised change detection method that consists only of a convolutional autoencoder (CAE) for feature extraction and a commonality autoencoder for commonality exploration. The CAE eliminates a large part of the redundancy in the two heterogeneous images and obtains more consistent feature representations. The proposed commonality autoencoder can discover common features of ground objects between two heterogeneous images by transforming one heterogeneous image representation into the other. Unchanged regions containing the same ground objects share many more common features than changed regions. Therefore, the number of common features can indicate changed and unchanged regions, from which a difference map can be calculated. Finally, the change detection result is generated by applying a segmentation algorithm to the difference map. In our method, the network parameters of the commonality autoencoder are learned from the relevance of unchanged regions rather than from labels. Our experimental results on five real datasets demonstrate the promising performance of the proposed framework compared with several existing approaches.
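The snippet below sketches the "number of common features" idea: per pixel, count the feature channels on which the translated representation agrees with the target one, and invert the count into a difference map. The agreement threshold is an illustrative assumption.

```python
import numpy as np

def common_feature_map(feat_translated, feat_target, tol=0.1):
    """feats: (C, H, W) arrays scaled to [0, 1]. Returns an (H, W) difference map
    where high values indicate FEW common features, i.e. likely change."""
    agree = np.abs(feat_translated - feat_target) < tol   # channel-wise agreement
    common = agree.sum(axis=0)                            # count of common features
    return 1.0 - common / feat_target.shape[0]            # invert: high = changed
```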
7. Multimodal Fusion of Mobility Demand Data and Remote Sensing Imagery for Urban Land-Use and Land-Cover Mapping. Remote Sensing 2022. [DOI: 10.3390/rs14143370]
Abstract
This paper explores the potential of the multimodal fusion of remote sensing imagery with information derived from mobility demand data in the framework of land-use mapping in urban areas. After a discussion of the role of mobility demand data, a probabilistic fusion framework is developed to take advantage of remote sensing and transport data jointly for land-use and land-cover applications in urban and surrounding areas. Two methods are proposed within this framework, the first based on pixelwise probabilistic decision fusion and the second on its combination with a region-based multiscale Markov random field. The experimental validation is conducted on a case study of the city of Genoa, Italy.
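A minimal sketch of pixelwise probabilistic decision fusion follows, combining per-source class posteriors under an assumed conditional-independence model; the priors and the independence assumption are simplifications of the paper's framework.

```python
import numpy as np

def fuse_posteriors(p_rs, p_mob, prior):
    """p_rs, p_mob: (H, W, K) class posteriors from remote sensing and mobility
    data; prior: (K,) class prior. Returns the fused (H, W, K) posterior."""
    fused = p_rs * p_mob / prior              # p(c|x1,x2) proportional to p(c|x1)p(c|x2)/p(c)
    return fused / fused.sum(axis=-1, keepdims=True)

labels = fuse_posteriors(np.random.dirichlet(np.ones(4), (8, 8)),
                         np.random.dirichlet(np.ones(4), (8, 8)),
                         np.full(4, 0.25)).argmax(axis=-1)   # fused class map
```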
8. Luppino LT, Hansen MA, Kampffmeyer M, Bianchi FM, Moser G, Jenssen R, Anfinsen SN. Code-Aligned Autoencoders for Unsupervised Change Detection in Multimodal Remote Sensing Images. IEEE Transactions on Neural Networks and Learning Systems 2022; PP:60-72. [PMID: 35552141] [DOI: 10.1109/tnnls.2022.3172183]
Abstract
Image translation with convolutional autoencoders has recently been used as an approach to multimodal change detection (CD) in bitemporal satellite images. A main challenge is aligning the code spaces by reducing the contribution of change pixels to the learning of the translation function. Many existing approaches train the networks by exploiting supervised information about the change areas, which, however, is not always available. We propose to extract relational pixel information captured by domain-specific affinity matrices at the input and use it to enforce alignment of the code spaces and reduce the impact of change pixels on the learning objective. A change prior is derived in an unsupervised fashion from pixel-pair affinities that are comparable across domains. To achieve code space alignment, we enforce that pixels with similar affinity relations in the input domains are also correlated in code space. We demonstrate the utility of this procedure in combination with cycle consistency. The proposed approach is compared with state-of-the-art machine learning and deep learning algorithms. Experiments conducted on four real and representative datasets show the effectiveness of our methodology.
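A compact sketch of the affinity-based change prior is given below: pixels whose affinity relations differ strongly across the two domains receive a high prior of being changed. The RBF kernel and its width are assumptions; the paper's exact prior construction differs in detail.

```python
import numpy as np

def change_prior(x, y):
    """x: (N, dx) and y: (N, dy) features of the SAME N pixels in the two domains
    (use a subset: the affinity matrices are N x N). Returns an (N,) prior in [0, 1]."""
    def affinity(z):
        d2 = ((z[:, None, :] - z[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (d2.mean() + 1e-12))   # domain-specific RBF affinity matrix
    ax, ay = affinity(x), affinity(y)
    dev = np.abs(ax - ay).mean(axis=1)             # per-pixel affinity deviation
    return (dev - dev.min()) / (np.ptp(dev) + 1e-12)
```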
9. Guo Q, Song H, Fan J, Ai D, Gao Y, Yu X, Yang J. Portal Vein and Hepatic Vein Segmentation in Multi-Phase MR Images Using Flow-Guided Change Detection. IEEE Transactions on Image Processing 2022; 31:2503-2517. [PMID: 35275817] [DOI: 10.1109/tip.2022.3157136]
Abstract
Segmenting the portal vein (PV) and hepatic vein (HV) from magnetic resonance imaging (MRI) scans is important for hepatic tumor surgery. Compared with single-phase-based methods, multi-phase-based methods scale better in distinguishing HV and PV by exploiting multi-phase information. However, existing methods only coarsely extract HV and PV from different phase images. In this paper, we propose a unified framework to automatically and robustly segment 3D HV and PV from multi-phase MR images, which considers both the change and the appearance caused by the vascular flow event to improve segmentation performance. First, inspired by change detection, flow-guided change detection (FGCD) is designed to detect the changed voxels related to hepatic venous flow by generating a hepatic venous phase map and clustering the map. FGCD handles HV and PV clustering uniformly through the proposed shared clustering, so that the appearance correlated with portal venous flow is delineated robustly without increasing framework complexity. Then, to refine the vascular segmentation results produced by HV and PV clustering, interclass decision making (IDM) is proposed, combining overlapping-region discrimination and neighborhood direction consistency. Finally, our framework is evaluated on multi-phase clinical MR images from a public dataset (TCGA) and a local hospital dataset. Quantitative and qualitative evaluations show that our framework outperforms existing methods.
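As a loose illustration of flow-guided clustering, the sketch below groups voxels by their multi-phase intensity profiles so that voxels sharing a venous-flow pattern cluster together; plain KMeans is a simplification standing in for the paper's shared clustering.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_phase_profiles(phases, n_clusters=3):
    """phases: (P, D, H, W) multi-phase MR volume. Returns (D, H, W) cluster labels."""
    P = phases.shape[0]
    profiles = phases.reshape(P, -1).T            # one P-dimensional profile per voxel
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(profiles)
    return labels.reshape(phases.shape[1:])
```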
10. 3MRS: An Effective Coarse-to-Fine Matching Method for Multimodal Remote Sensing Imagery. Remote Sensing 2022. [DOI: 10.3390/rs14030478]
Abstract
The fusion of image data from multiple sensors is crucial for many applications. However, there are significant nonlinear intensity deformations between images from different kinds of sensors, leading to matching failure. To address this problem, this paper proposes an effective coarse-to-fine matching method for multimodal remote sensing images (3MRS). In the coarse matching stage, feature points are first detected on a maximum moment map calculated with a phase congruency model. Feature description then uses an index map constructed by finding the index of the maximum value, over all orientations, of images convolved with a set of log-Gabor filters. Finally, matches are built through image matching and outlier removal, and used to estimate a reliable affine transformation between the images. In the fine matching stage, we develop a novel template matching method based on the log-Gabor convolution image sequence and match the template features with a 3D phase correlation strategy, given initial correspondences from the estimated transformation. Results show that, compared with SIFT and three state-of-the-art methods designed for multimodal image matching (PSO-SIFT, HAPCG, and RIFT), only 3MRS successfully matched all six types of multimodal remote sensing image pairs: optical–optical, optical–infrared, optical–depth, optical–map, optical–SAR, and day–night, each including ten different image pairs. On average, the number of correct matches (NCM) of 3MRS was 164.47, 123.91, 4.88, and 4.33 times that of SIFT, PSO-SIFT, HAPCG, and RIFT for the image pairs each method matched successfully. In terms of accuracy, the root-mean-square errors of correct matches for 3MRS, SIFT, PSO-SIFT, HAPCG, and RIFT are 1.47, 1.98, 1.79, 2.83, and 2.45 pixels, respectively, showing that 3MRS achieves the highest accuracy. Although the total running time of 3MRS was the longest, its efficiency in obtaining each correct match is the highest, considering it produces by far the largest number of matches. The source code of 3MRS and the experimental datasets and detailed results are publicly available.
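The sketch below builds an orientation index map of the kind described: the image is filtered with a small log-Gabor bank in the frequency domain, and each pixel keeps the index of the orientation with maximum response. Filter parameters are illustrative, not the 3MRS settings.

```python
import numpy as np

def log_gabor_index_map(img, n_orient=6, f0=0.1, sigma_f=0.55):
    """img: (H, W) grayscale image. Returns an (H, W) integer orientation index map."""
    H, W = img.shape
    fy, fx = np.meshgrid(np.fft.fftfreq(H), np.fft.fftfreq(W), indexing="ij")
    radius = np.hypot(fx, fy)
    radius[0, 0] = 1.0                                   # avoid log(0) at DC
    angle = np.arctan2(fy, fx)
    radial = np.exp(-(np.log(radius / f0) ** 2) / (2 * np.log(sigma_f) ** 2))
    F = np.fft.fft2(img)
    responses = []
    for o in range(n_orient):
        theta = o * np.pi / n_orient
        diff = np.angle(np.exp(1j * (angle - theta)))    # wrapped angular distance
        spread = np.exp(-(diff ** 2) / 0.5)              # angular Gaussian spread
        responses.append(np.abs(np.fft.ifft2(F * radial * spread)))
    return np.argmax(np.stack(responses), axis=0)        # per-pixel best orientation
```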
11. Attention-Guided Siamese Fusion Network for Change Detection of Remote Sensing Images. Remote Sensing 2021. [DOI: 10.3390/rs13224597]
Abstract
Change detection for remote sensing images is an indispensable procedure for many remote sensing applications, such as geological disaster assessment, environmental monitoring, and urban development monitoring. Through this technique, differences in affected areas after events such as natural disasters can be determined to estimate their influence. Additionally, by analyzing sequential difference maps, change tendencies can be identified to help predict future changes, such as urban development and environmental pollution. The complex variety of real changes and the interfering changes caused by imaging conditions, such as season, weather, and sensor differences, are critical factors that affect the effectiveness of change detection methods. Recently, there have been many research achievements on this topic, but a perfect solution to all the problems in change detection has not yet been achieved. In this paper, we mainly focus on reducing the influence of imaging conditions through deep neural network techniques with limited labeled samples. An attention-guided Siamese fusion network is constructed based on a basic Siamese network for change detection. In contrast to common processing, besides high-level feature fusion, feature fusion is performed throughout the feature extraction process using an attention information fusion module. This module can not only fuse information from the two feature extraction branches, but also guide the feature learning network to focus on feature channels of high importance. Finally, extensive experiments were performed on three public datasets, verifying the significance of information fusion and the guidance of the attention mechanism during feature learning in comparison with related methods.
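A minimal PyTorch sketch of attention-guided fusion of two Siamese branches follows: channel weights are derived from the concatenated branches in an SE-style gate. This is a generic stand-in under stated assumptions, not the paper's exact attention information fusion module.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        # squeeze-and-excitation-style gate over the concatenated branches
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(2 * channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, f1, f2):
        w = self.gate(torch.cat([f1, f2], dim=1))   # per-channel importance in [0, 1]
        w = w[:, :, None, None]
        return w * f1 + (1 - w) * f2                # attention-weighted branch fusion

fused = AttentionFusion(32)(torch.rand(1, 32, 16, 16), torch.rand(1, 32, 16, 16))
```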
12. Exploiting High Geopositioning Accuracy of SAR Data to Obtain Accurate Geometric Orientation of Optical Satellite Images. Remote Sensing 2021. [DOI: 10.3390/rs13173535]
Abstract
Accurate geopositioning of optical satellite imagery is a fundamental step for many photogrammetric applications. Owing to their imaging principle and data processing manner, SAR satellites can achieve high geopositioning accuracy. Therefore, SAR data can be a reliable source of control information for the orientation of optical satellite images. This paper proposes a practical solution for the accurate orientation of optical satellite images using SAR reference images to take advantage of the merits of SAR data. First, we propose an accurate and robust multimodal image matching method to match SAR and optical satellite images. This approach includes a new structure-based feature descriptor applicable to multimodal images, built on angle-weighted oriented gradients (AWOG), and a three-dimensional phase correlation similarity measure. Second, we put forward a general orientation framework for optical satellite imagery based on multiple SAR reference images, which uses the matches between the SAR and optical satellite images as virtual control points. A large number of experiments not only demonstrate the superiority of the proposed matching method over state-of-the-art methods but also prove the effectiveness of the proposed orientation framework. In particular, matching performance is improved by about 17% compared with the latest multimodal image matching method, CFOG, and the geopositioning accuracy of optical satellite images is improved from more than 200 m to around 8 m.
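The sketch below shows one plausible reading of an angle-weighted oriented-gradient descriptor: each pixel's gradient magnitude is split between its two nearest orientation bins in proportion to angular distance, giving a dense multi-channel descriptor. The bin count and folding to [0, pi) are assumptions, not the published AWOG definition.

```python
import numpy as np

def awog_descriptor(img, n_bins=8):
    """img: (H, W) grayscale image. Returns a dense (n_bins, H, W) descriptor."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)      # orientation folded to [0, pi)
    pos = ang / (np.pi / n_bins)                 # continuous bin position
    lo = np.floor(pos).astype(int) % n_bins
    w_hi = pos - np.floor(pos)                   # angle weighting toward the next bin
    ii, jj = np.indices(img.shape)
    desc = np.zeros((n_bins,) + img.shape)
    np.add.at(desc, (lo, ii, jj), mag * (1.0 - w_hi))
    np.add.at(desc, ((lo + 1) % n_bins, ii, jj), mag * w_hi)
    return desc
```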
13. Sun Y, Lei L, Guan D, Kuang G. Iterative Robust Graph for Unsupervised Change Detection of Heterogeneous Remote Sensing Images. IEEE Transactions on Image Processing 2021; 30:6277-6291. [PMID: 34232875] [DOI: 10.1109/tip.2021.3093766]
Abstract
This work presents a robust graph mapping approach for the unsupervised heterogeneous change detection problem in remote sensing imagery. To address the challenge that heterogeneous images cannot be directly compared due to different imaging mechanisms, we take advantage of the fact that heterogeneous images share the same structure information for the same ground object, which is imaging-modality-invariant. The proposed method first constructs a robust K-nearest neighbor graph to represent the structure of each image, and then compares the graphs within the same image domain by means of graph mapping to calculate the forward and backward difference images, which avoids the confusion of heterogeneous data. Finally, it detects the changes through a Markovian co-segmentation model that fuses the forward and backward difference images in the segmentation process and can be solved by the co-graph cut. Once the changed areas are detected by the Markovian co-segmentation, they are propagated back into the graph construction process to reduce the influence of changed neighbors. This iterative framework makes the graph more robust and thus improves the final detection performance. Experimental results on different datasets confirm the effectiveness of the proposed method. Source code of the proposed method is made available at https://github.com/yulisun/IRG-McS.
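The sketch below illustrates the graph-mapping comparison for one direction: each superpixel of image Y is compared, within Y's own domain, against the neighbors its counterpart selects in image X, so that a poor fit signals broken shared structure, i.e. probable change. The fraction-based score is a simplification of the paper's formulation.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def forward_difference(feat_x, feat_y, k=10):
    """feat_x, feat_y: (N, d) superpixel features of images X and Y.
    Returns an (N,) forward difference image: Y compared through X's KNN graph."""
    idx = NearestNeighbors(n_neighbors=k + 1).fit(feat_x).kneighbors(
        feat_x, return_distance=False)[:, 1:]     # X's KNN graph (self dropped)
    # mean feature distance in Y between each node and its X-defined neighbors
    di = np.array([np.linalg.norm(feat_y[i] - feat_y[nbrs], axis=1).mean()
                   for i, nbrs in enumerate(idx)])
    return (di - di.min()) / (np.ptp(di) + 1e-12)
```

Swapping the roles of feat_x and feat_y gives the backward difference image, and the two can then be fused in the segmentation step.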
14. Unsupervised Change Detection Using Spectrum-Trend and Shape Similarity Measure. Remote Sensing 2020. [DOI: 10.3390/rs12213606]
Abstract
The emergence of very high resolution (VHR) images poses significant challenges for change detection. It is hard for traditional pixel-level approaches to achieve satisfactory performance due to radiometric differences. This work proposes a novel feature descriptor based on the spectrum-trend and shape context for VHR remote sensing images. The proposed method has two main parts: the spectrum-trend graph is generated first, and then shape context is applied to describe the shape of the spectrum-trend. By constructing the spectrum-trend graph, spatial and spectral information is integrated effectively. The approach is implemented and assessed on QuickBird and SPOT-5 satellite images. Quantitative analysis of comparative experiments proves the effectiveness of the proposed technique in handling radiometric differences and improving change detection accuracy; the results indicate that both overall accuracy and robustness are boosted. Moreover, this work provides a novel viewpoint for discriminating changed and unchanged pixels by comparing the shape similarity of local spectrum-trends.
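As a rough illustration of comparing spectrum-trend shapes, the sketch below treats each pixel's band-to-band differences as a trend vector and scores change by cosine dissimilarity between dates; the full shape-context descriptor of the paper is left out, so this is an assumption-laden simplification.

```python
import numpy as np

def spectrum_trend_change(img_t1, img_t2):
    """img_t1, img_t2: (H, W, B) multispectral images. Returns (H, W) change score."""
    trend1 = np.diff(img_t1, axis=-1)            # spectrum-trend of date 1
    trend2 = np.diff(img_t2, axis=-1)            # spectrum-trend of date 2
    num = (trend1 * trend2).sum(-1)
    den = (np.linalg.norm(trend1, axis=-1)
           * np.linalg.norm(trend2, axis=-1) + 1e-12)
    return 1.0 - num / den                       # dissimilar trend shape = change
```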
15. Graph-Based Data Fusion Applied to: Change Detection and Biomass Estimation in Rice Crops. Remote Sensing 2020. [DOI: 10.3390/rs12172683]
Abstract
The complementary nature of the different modalities and multiple bands used in remote sensing data is helpful for tasks such as change detection and the prediction of agricultural variables. Nonetheless, correctly processing a multi-modal dataset is not a simple task, owing to the presence of different data resolutions and formats. In the past few years, graph-based methods have proven to be a useful tool for capturing inherent data similarity, in spite of different data formats, while preserving relevant topological and geometric information. In this paper, we propose a graph-based data fusion algorithm for remotely sensed images applied to (i) data-driven semi-unsupervised change detection and (ii) biomass estimation in rice crops. For change detection, we evaluated the performance of four competing algorithms on fourteen datasets. For biomass estimation in rice crops, we compared our proposal, in terms of root mean squared error (RMSE), against a recent approach that uses vegetation indices as features. The results confirm that the proposed graph-based data fusion algorithm outperforms state-of-the-art methods for change detection and biomass estimation in rice crops.
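The sketch below shows one simple realization of graph-based fusion across modalities: a kNN affinity matrix is built per modality and the matrices are combined element-wise so that only links supported by every modality survive. The product rule is an illustrative choice, not necessarily the paper's fusion operator.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def fused_affinity(modalities, k=10):
    """modalities: list of (N, d_m) feature matrices over shared sample sites.
    Returns a fused (N, N) affinity matrix."""
    fused = None
    for feats in modalities:
        A = kneighbors_graph(feats, k, mode="distance").toarray()
        A = np.exp(-A / (A[A > 0].mean() + 1e-12)) * (A > 0)   # RBF on kNN links
        A = np.maximum(A, A.T)                                 # symmetrize the graph
        fused = A if fused is None else fused * A              # keep consensus links
    return fused
```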