1. Sun Y, Lei L, Guan D, Kuang G, Li Z, Liu L. Locality Preservation for Unsupervised Multimodal Change Detection in Remote Sensing Imagery. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:6955-6969. [PMID: 38809739] [DOI: 10.1109/tnnls.2024.3401696]
Abstract
Multimodal change detection (MCD) is a topic of increasing interest in remote sensing. Due to different imaging mechanisms, multimodal images cannot be directly compared to detect changes. In this article, we explore the topological structure of multimodal images and construct links between the class relationships (same/different) and change labels (changed/unchanged) of pairwise superpixels, which are imaging-modality-invariant. With these links, we formulate the MCD problem within a mathematical framework termed the locality-preserving energy model (LPEM), which is used to maintain the local consistency constraints embedded in the links: structure consistency based on feature similarity and label consistency based on spatial continuity. Because the foundation of LPEM, i.e., the links, is intuitively explainable and universal, the proposed method is very robust across different MCD situations. Notably, LPEM is built directly on the label of each superpixel, so it is a paradigm that outputs the change map (CM) directly without the need to generate an intermediate difference image (DI), as most previous algorithms do. Experiments on different real datasets demonstrate the effectiveness of the proposed method. Source code of the proposed method is made available at https://github.com/yulisun/LPEM.
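To make the two consistency constraints concrete, the following is a minimal, illustrative sketch of an energy of this general shape over binary superpixel labels, minimized greedily. The penalty forms, weights, and the ICM-style optimizer are assumptions for illustration, not the paper's exact LPEM formulation or solver.

```python
# Toy locality-preserving energy over binary superpixel change labels,
# minimized greedily (iterated conditional modes). Illustrative only.
import numpy as np

def icm_change_labels(unary, sim_links, adj_links, alpha=1.0, beta=1.0, iters=10):
    """unary     : (N, 2) per-superpixel costs for labels {0: unchanged, 1: changed}
    sim_links : list of (i, j, s) feature-similarity links with weight s in [0, 1]
    adj_links : list of (i, j) spatial-adjacency links
    """
    n = unary.shape[0]
    labels = unary.argmin(axis=1)
    sim_nb = [[] for _ in range(n)]
    for i, j, s in sim_links:
        sim_nb[i].append((j, s)); sim_nb[j].append((i, s))
    adj_nb = [[] for _ in range(n)]
    for i, j in adj_links:
        adj_nb[i].append(j); adj_nb[j].append(i)

    def energy(i, lab):
        e = unary[i, lab]
        # structure consistency: feature-similar superpixels should agree
        e += alpha * sum(s for j, s in sim_nb[i] if labels[j] != lab)
        # label consistency: spatially adjacent superpixels should agree
        e += beta * sum(1 for j in adj_nb[i] if labels[j] != lab)
        return e

    for _ in range(iters):
        for i in range(n):
            labels[i] = 0 if energy(i, 0) <= energy(i, 1) else 1
    return labels
```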
2. Liu T, Zhang M, Gong M, Zhang Q, Jiang F, Zheng H, Lu D. Commonality Feature Representation Learning for Unsupervised Multimodal Change Detection. IEEE Transactions on Image Processing 2025; 34:1219-1233. [PMID: 40031527] [DOI: 10.1109/tip.2025.3539461]
Abstract
The main challenge of multimodal change detection (MCD) is that multimodal bitemporal images (MBIs) cannot be compared directly to identify changes. To overcome this problem, this paper proposes a novel commonality feature representation learning (CFRL) method and constructs a CFRL-based unsupervised MCD framework. The CFRL is composed of a Siamese-based encoder and two decoders. First, the Siamese-based encoder maps the original MBIs into the same feature space to extract representative features of each modality. Then, the two decoders reconstruct the original MBIs by regressing each modality onto itself. Meanwhile, we swap the decoders to reconstruct pseudo-MBIs and thereby conduct modality alignment. Subsequently, all reconstructed images are fed into the Siamese-based encoder again and mapped into the same feature space, from which representative features are obtained. On this basis, latent commonality features between MBIs can be extracted by minimizing the distance between these representative features. These latent commonality features are comparable and can be used to identify changes. Notably, the proposed CFRL can be performed simultaneously in the two modalities of the MBIs. Therefore, two change magnitude images (CMIs) can be generated simultaneously by measuring the difference between the commonality features of the MBIs. Finally, a simple threshold algorithm or a clustering algorithm can be employed to binarize the CMIs into change maps. Extensive experiments on six publicly available MCD datasets show that the proposed CFRL-based framework achieves superior performance compared with other state-of-the-art approaches.
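The cross-reconstruction scheme described above can be pictured with a short PyTorch-style sketch: a shared (Siamese) encoder, one decoder per modality, swapped decoding to produce pseudo-images, and a distance loss that pulls the re-encoded representations together. Layer sizes, the plain MSE losses, and the assumption that both modalities share the same channel count are illustrative choices, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedEncoder(nn.Module):
    def __init__(self, in_ch=3, feat_ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU())
    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    def __init__(self, feat_ch=32, out_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, out_ch, 3, padding=1))
    def forward(self, f):
        return self.net(f)

def cfrl_step(enc, dec_x, dec_y, x, y):
    """x, y: bitemporal images of the two modalities (same channel count assumed)."""
    fx, fy = enc(x), enc(y)                     # shared feature space
    rec_x, rec_y = dec_x(fx), dec_y(fy)         # self-reconstruction
    pseudo_y, pseudo_x = dec_y(fx), dec_x(fy)   # swapped decoders -> pseudo images
    loss_rec = F.mse_loss(rec_x, x) + F.mse_loss(rec_y, y)
    # re-encode the reconstructions and pull same-content pairs together
    loss_common = (F.mse_loss(enc(rec_x), enc(pseudo_y)) +
                   F.mse_loss(enc(rec_y), enc(pseudo_x)))
    return loss_rec + loss_common
```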
3. Zhang YX, Gui J, Kwok JTY. Constructing Diverse Inlier Consistency for Partial Point Cloud Registration. IEEE Transactions on Image Processing 2024; 33:6535-6549. [PMID: 39531562] [DOI: 10.1109/tip.2024.3492700]
Abstract
Partial point cloud registration aims to align partial scans into a shared coordinate system. While learning-based partial point cloud registration methods have achieved remarkable progress, they often fail to take full advantage of the relative positional relationships both within (intra-) and between (inter-) point clouds. This oversight hampers their ability to accurately identify overlapping regions and search for reliable correspondences. To address these limitations, a diverse inlier consistency (DIC) method is proposed that adaptively embeds the positional information of a reliable correspondence within the intra- and inter-point cloud. Firstly, a diverse inlier consistency-driven region perception (DICdRP) module is devised, which encodes the positional information of the selected correspondence within the intra-point cloud. This module enhances the sensitivity of all points to overlapping regions by recognizing the position of the selected correspondence. Secondly, a diverse inlier consistency-aware correspondence search (DICaCS) module is developed, which leverages relative positions in the inter-point cloud. This module learns an inter-point-cloud DIC weight to supervise correspondence compatibility, allowing precise identification of correspondences and effective outlier filtering. Thirdly, diverse information is integrated throughout the framework to achieve a more holistic and detailed registration process. Extensive experiments on object-level and scene-level datasets demonstrate the superior performance of the proposed algorithm. The code is available at https://github.com/yxzhang15/DIC.
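The inlier-consistency cue that such correspondence-search modules exploit can be illustrated generically: under a rigid transform, pairwise distances are preserved, so two correct correspondences should agree on them. The sketch below builds such a pairwise compatibility matrix; the Gaussian weighting and the sigma value are assumptions, and this is a generic illustration rather than the paper's learned DIC weight.

```python
import numpy as np

def compatibility_matrix(src_pts, tgt_pts, sigma=0.1):
    """src_pts, tgt_pts : (K, 3) candidate correspondences (matched row by row).
    Returns a (K, K) matrix; high values mean two correspondences preserve the
    same pairwise distance and are therefore mutually consistent (likely inliers)."""
    d_src = np.linalg.norm(src_pts[:, None, :] - src_pts[None, :, :], axis=-1)
    d_tgt = np.linalg.norm(tgt_pts[:, None, :] - tgt_pts[None, :, :], axis=-1)
    return np.exp(-((d_src - d_tgt) ** 2) / (2 * sigma ** 2))

# Correspondences whose rows have high total compatibility can be kept as
# inliers; outliers tend to be inconsistent with most other correspondences.
```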
4. Zhang M, Gao T, Gong M, Zhu S, Wu Y, Li H. Semisupervised Change Detection Based on Bihierarchical Feature Aggregation and Extraction Network. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:10488-10502. [PMID: 37022855] [DOI: 10.1109/tnnls.2023.3242075]
Abstract
With the rapid development of remote sensing (RS) technology, high-resolution RS image change detection (CD) has been widely used in many applications. Pixel-based CD techniques are maneuverable and widely used but vulnerable to noise interference. Object-based CD techniques can effectively utilize the abundant spectral, textural, shape, and spatial information of RS images but tend to overlook fine details. How to combine the advantages of pixel-based and object-based methods remains a challenging problem. Besides, although supervised methods have the capability to learn from data, true labels representing the changed information of RS images are often hard to obtain. To address these issues, this article proposes a novel semisupervised CD framework for high-resolution RS images, which employs a small amount of labeled data and a large amount of unlabeled data to train the CD network. A bihierarchical feature aggregation and extraction network (BFAEN) is designed to concatenate pixelwise and objectwise features into a joint representation for comprehensive utilization of the two-level features. To alleviate the coarseness and insufficiency of labeled samples, a confident learning algorithm is used to eliminate noisy labels, and a novel loss function is designed to train the model with true and pseudo labels in a semisupervised fashion. Experimental results on real datasets demonstrate the effectiveness and superiority of the proposed method.
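The semisupervised training signal can be sketched as a combination of cross-entropy on the few true labels and cross-entropy on confident pseudo-labels. The confidence threshold, the loss weight, and the simple probability thresholding used here in place of the paper's confident-learning step are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def semisup_cd_loss(logits_lab, targets_lab, logits_unlab, tau=0.9, lam=0.5):
    """logits_lab   : (N, C) predictions for labeled pixels/objects
    targets_lab  : (N,)   true class indices
    logits_unlab : (M, C) predictions for unlabeled pixels/objects
    """
    # supervised term on the small labeled set
    loss_sup = F.cross_entropy(logits_lab, targets_lab)
    # pseudo-label term: keep only confident predictions on unlabeled data
    probs = logits_unlab.softmax(dim=1)
    conf, pseudo = probs.max(dim=1)
    mask = conf > tau
    if mask.any():
        loss_pseudo = F.cross_entropy(logits_unlab[mask], pseudo[mask])
    else:
        loss_pseudo = logits_unlab.new_zeros(())
    return loss_sup + lam * loss_pseudo
```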
5. Sun Y, Lei L, Guan D, Wu J, Kuang G, Liu L. Image Regression With Structure Cycle Consistency for Heterogeneous Change Detection. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:1613-1627. [PMID: 35767492] [DOI: 10.1109/tnnls.2022.3184414]
Abstract
Change detection (CD) between heterogeneous images is an increasingly interesting topic in remote sensing. The different imaging mechanisms lead to the failure of homogeneous CD methods on heterogeneous images. To address this challenge, we propose a structure cycle consistency-based image regression method, which consists of two components: the exploration of structure representation and the structure-based regression. We first construct a similarity relationship-based graph to capture the structure information of the image; here, a k-selection strategy and an adaptively weighted distance metric are employed to connect each node with its truly similar neighbors. Then, we conduct the structure-based regression with this adaptively learned graph. More specifically, we transform one image to the domain of the other image via the structure cycle consistency, which yields three types of constraints: a forward transformation term, a cycle transformation term, and a sparse regularization term. Notably, this is not a traditional pixel-value-based image regression but an image structure regression, i.e., it requires the transformed image to have the same structure as the original image. Finally, change extraction can be achieved accurately by directly comparing the transformed and original images. Experiments conducted on different real datasets show the excellent performance of the proposed method. The source code of the proposed method will be made available at https://github.com/yulisun/AGSCC.
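Schematically, a three-term structure-regression objective of the kind described (forward, cycle, sparsity) might take the shape below, where T maps image x into the other domain, T' maps back, S(.) denotes a graph-structure operator, and Delta is the change indicator. The notation and norms are illustrative assumptions, not the paper's exact formulation.

```latex
% Schematic only: forward structure term + cycle structure term + sparse change term.
\min_{T,\,T'}\;
\big\lVert \mathcal{S}\!\left(T(\mathbf{x})\right) - \mathcal{S}(\mathbf{x}) \big\rVert_F^2
\;+\; \lambda_1 \big\lVert \mathcal{S}\!\left(T'\!\left(T(\mathbf{x})\right)\right) - \mathcal{S}(\mathbf{x}) \big\rVert_F^2
\;+\; \lambda_2 \big\lVert \boldsymbol{\Delta} \big\rVert_1
```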
6. Li H, Wu B, Sun M, Ye Y, Zhu Z, Chen K. Multi-view graph neural network with cascaded attention for lncRNA-miRNA interaction prediction. Knowledge-Based Systems 2023. [DOI: 10.1016/j.knosys.2023.110492]
7. Zhang P, Chen J, Che C, Zhang L, Jin B, Zhu Y. IEA-GNN: Anchor-aware Graph Neural Network Fused with Information Entropy for Node Classification and Link Prediction. Information Sciences 2023. [DOI: 10.1016/j.ins.2023.03.022]
8. Han B, Wei Y, Wang Q, Wan S. CoLM2S: Contrastive self-supervised learning on attributed multiplex graph network with multi-scale information. CAAI Transactions on Intelligence Technology 2023. [DOI: 10.1049/cit2.12168]
Affiliations: all authors are with the College of Systems Engineering, National University of Defense Technology, Changsha, Hunan, China.
9. Graph Neural Networks Induced by Concept Lattices for Classification. International Journal of Approximate Reasoning 2023. [DOI: 10.1016/j.ijar.2023.01.001]
10. Li M, Chen S, Yang W, Wang Q. Multi-Stream Graph Convolutional Networks for Text Classification via Representative-Word Document Mining. International Journal of Computational Intelligence and Applications 2022. [DOI: 10.1142/s1469026822500286]
Abstract
Recently, graph convolutional networks (GCNs) for text classification have received considerable attention in natural language processing. However, most current methods use only the original documents and words in the corpus to construct the graph topology, which may lose some effective information. In this paper, we propose a Multi-Stream Graph Convolutional Network (MS-GCN) for text classification via Representative-Word Document (RWD) mining, implemented in PyTorch. In the proposed method, we first introduce temporary labels and mine the RWDs, which are treated as additional documents in the corpus. Then, we build a heterogeneous graph based on the relations among a Group of RWDs (GRWDs), words, and original documents. Furthermore, we construct the MS-GCN based on multiple heterogeneous graphs according to different GRWDs. Finally, we optimize our MS-GCN model through an updating mechanism for the GRWDs. We evaluate the proposed approach on six text classification datasets: 20NG, R8, R52, Ohsumed, MR, and Pheme. Extensive experiments on these datasets show that our proposed approach outperforms state-of-the-art methods for text classification.
Affiliations: Meng Li, Shenyu Chen, and Qianying Wang are with the College of Mathematics and Statistics, Hebei University of Economics and Business, Hebei, P. R. China.
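As a point of reference, the graph-convolution backbone that such text GCNs build on can be sketched in a few lines of NumPy: a symmetrically normalized document-word adjacency is propagated through two GCN layers, and the document nodes are classified from the output. The toy graph, identity features, and sizes are assumptions; the multi-stream RWD construction of MS-GCN is not reproduced here.

```python
import numpy as np

def normalize_adj(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}."""
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_forward(adj, feats, w1, w2):
    """Two-layer GCN: softmax(A_hat . relu(A_hat . X . W1) . W2)."""
    a_hat = normalize_adj(adj)
    h = np.maximum(a_hat @ feats @ w1, 0.0)
    logits = a_hat @ h @ w2
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Toy example: 4 document nodes + 3 word nodes, identity features, random weights.
rng = np.random.default_rng(0)
n_nodes, n_feat, n_hidden, n_class = 7, 7, 16, 2
adj = rng.integers(0, 2, size=(n_nodes, n_nodes)).astype(float)
adj = np.maximum(adj, adj.T)  # undirected document-word graph (toy)
probs = gcn_forward(adj, np.eye(n_feat),
                    rng.normal(size=(n_feat, n_hidden)) * 0.1,
                    rng.normal(size=(n_hidden, n_class)) * 0.1)
print(probs[:4])  # class probabilities for the document nodes
```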
11. Zhao L, Liu C, Qu H. Transmission Line Object Detection Method Based on Contextual Information Enhancement and Joint Heterogeneous Representation. Sensors (Basel) 2022; 22:6855. [PMID: 36146204] [PMCID: PMC9500743] [DOI: 10.3390/s22186855]
Abstract
Transmission line inspection plays an important role in maintaining power security. In transmission line object detection, the large scale gap among fittings remains a major factor limiting detection accuracy. In this study, an optimized method is proposed based on contextual information enhancement (CIE) and joint heterogeneous representation (JHR). In the high-resolution feature extraction layer of the Swin Transformer, a convolution is added to the self-attention calculation, which enhances contextual information features and improves the feature extraction ability for small objects. Moreover, in the detection head, the joint heterogeneous representations of different detection methods are combined to enhance the features for the classification and localization tasks, which improves the detection accuracy of small objects. The experimental results show that this optimized method performs well on small and obscured objects in transmission lines. The total mAP (mean average precision) over the detected objects is increased by 5.8%, and in particular, the AP of the normal pin is increased by 18.6%. The improved accuracy of transmission line object detection lays a foundation for further real-time inspection.
Affiliations: Lijuan Zhao is with the School of Control and Computer Engineering, North China Electric Power University, Beijing 102206, China; Chang’an Liu and Hongquan Qu are with the School of Information, North China University of Technology, Beijing 100144, China.
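The general idea of adding convolution alongside self-attention to recover local context can be sketched as follows; this is a generic illustration with assumed module names and sizes, not the paper's CIE module or its Swin Transformer integration.

```python
import torch
import torch.nn as nn

class ContextEnhancedAttention(nn.Module):
    """Global self-attention plus a depthwise-convolution branch for local context."""
    def __init__(self, dim=96, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.local = nn.Conv2d(dim, dim, kernel_size=3, padding=1, groups=dim)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):                               # x: (B, C, H, W)
        b, c, h, w = x.shape
        tokens = self.norm(x.flatten(2).transpose(1, 2))  # (B, HW, C)
        attn_out, _ = self.attn(tokens, tokens, tokens)
        attn_out = attn_out.transpose(1, 2).reshape(b, c, h, w)
        return x + attn_out + self.local(x)             # global + local context

# Usage sketch:
# feats = ContextEnhancedAttention(dim=96)(torch.randn(2, 96, 32, 32))
```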
13. MSGATN: A Superpixel-Based Multi-Scale Siamese Graph Attention Network for Change Detection in Remote Sensing Images. Applied Sciences (Basel) 2022. [DOI: 10.3390/app12105158]
Abstract
With the rapid development of Earth observation technology, how to effectively and efficiently detect changes in multi-temporal images has become an important but challenging problem. Relying on its advantages of high performance and robustness, object-based change detection (CD) has become increasingly popular. By analyzing the similarity of local pixels, object-based CD aggregates similar pixels into objects and takes them as the basic processing units. However, object-based approaches often have difficulty capturing discriminative features, as irregular objects make processing difficult. To address this problem, in this paper we propose a novel superpixel-based multi-scale Siamese graph attention network (MSGATN) which can process unstructured data natively and extract valuable features. First, a difference image (DI) is generated by computing the Euclidean distance between the bitemporal images. Second, superpixel segmentation is applied to the DI to divide each image into many homogeneous regions. Then, these superpixels are used to model the problem with graph theory, constructing a series of nodes and the adjacency between them. Subsequently, the multi-scale neighborhood features of the nodes are extracted through a graph convolutional network and concatenated by an attention mechanism. Finally, the binary change map is obtained by classifying each node with fully connected layers. The novel features of MSGATN can be summarized as follows: (1) training on multi-scale constructed graphs improves the recognition of changed land cover of varied sizes and shapes; (2) spectral and spatial self-attention mechanisms are exploited for better change detection performance. The experimental results on several real datasets show the effectiveness and superiority of the proposed method. In addition, compared to other recent methods, the proposed one demonstrates very high processing efficiency and greatly reduces the dependence on labeled training samples through a semisupervised training fashion.
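The preprocessing pipeline described above (difference image, superpixels, graph construction) can be sketched as follows. The SLIC parameters, the mean-DI node feature, and the 4-connectivity adjacency rule are assumptions, and scikit-image >= 0.19 is assumed for the channel_axis argument.

```python
import numpy as np
from skimage.segmentation import slic

def superpixel_graph(img_t1, img_t2, n_segments=500):
    """img_t1, img_t2 : (H, W, C) co-registered bitemporal images."""
    # difference image: per-pixel Euclidean distance between the two dates
    di = np.linalg.norm(img_t1.astype(float) - img_t2.astype(float), axis=-1)
    di /= di.max() + 1e-8
    segments = slic(di, n_segments=n_segments, compactness=0.1,
                    channel_axis=None)                  # (H, W) superpixel labels
    labels = np.unique(segments)
    feats = np.array([di[segments == l].mean() for l in labels])  # node features

    # adjacency: two superpixels are linked if they touch horizontally or vertically
    edges = set()
    for a, b in [(segments[:, :-1], segments[:, 1:]),
                 (segments[:-1, :], segments[1:, :])]:
        diff = a != b
        edges.update(zip(a[diff].ravel(), b[diff].ravel()))
    return labels, feats, edges
```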
14. Few-Shot Learning with Collateral Location Coding and Single-Key Global Spatial Attention for Medical Image Classification. Electronics 2022. [DOI: 10.3390/electronics11091510]
Abstract
Humans are born with the ability to learn quickly, discerning objects from a few samples, acquiring new skills in a short period of time, and making decisions based on limited prior experience and knowledge. Existing deep learning models for medical image classification often rely on a large number of labeled training samples, and a comparable fast learning ability has yet to be developed in deep neural networks. In addition, retraining a deep model requires a large amount of time and computing resources when it encounters classes it has never seen before. However, for healthcare applications, enabling a model to generalize to new clinical scenarios is of great importance. Existing image classification methods cannot explicitly use the location information of each pixel, making them insensitive to cues related only to location. Besides, they also rely on local convolution and cannot properly utilize global information, which is essential for image classification. To alleviate these problems, we propose a collateral location coding that helps the network explicitly exploit the location information of each pixel, making it easier to recognize cues related only to location, and a single-key global spatial attention designed so that the pixels at each location perceive the global spatial information in a low-cost way. Experimental results on three medical image benchmark datasets demonstrate that our proposed algorithm outperforms state-of-the-art approaches in both effectiveness and generalization ability.
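Under one plausible reading of the two components named above, "collateral location coding" can be approximated by appending normalized coordinate channels to the feature map, and "single-key global spatial attention" by a single learned key that scores every spatial position to form one global attention map. The sketch below is an illustrative interpretation, not the authors' modules.

```python
import torch
import torch.nn as nn

class LocationCoding(nn.Module):
    """Append explicit normalized (y, x) coordinate channels to the feature map."""
    def forward(self, x):                      # x: (B, C, H, W)
        b, _, h, w = x.shape
        ys = torch.linspace(-1, 1, h, device=x.device).view(1, 1, h, 1).expand(b, 1, h, w)
        xs = torch.linspace(-1, 1, w, device=x.device).view(1, 1, 1, w).expand(b, 1, h, w)
        return torch.cat([x, ys, xs], dim=1)

class SingleKeyGlobalAttention(nn.Module):
    """One learned key scores all positions; the global context is broadcast back."""
    def __init__(self, dim):
        super().__init__()
        self.key = nn.Conv2d(dim, 1, kernel_size=1)     # one score per position
        self.value = nn.Conv2d(dim, dim, kernel_size=1)

    def forward(self, x):                      # x: (B, C, H, W)
        b, c, h, w = x.shape
        scores = self.key(x).view(b, 1, h * w).softmax(dim=-1)   # global attention map
        values = self.value(x).view(b, c, h * w)
        context = torch.bmm(values, scores.transpose(1, 2))      # (B, C, 1)
        return x + context.view(b, c, 1, 1)    # add global context at every location
```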
15. An Adaptive Surrogate-Assisted Endmember Extraction Framework Based on Intelligent Optimization Algorithms for Hyperspectral Remote Sensing Images. Remote Sensing 2022. [DOI: 10.3390/rs14040892]
Abstract
As the foremost step of spectral unmixing, endmember extraction has been one of the most challenging techniques in spectral unmixing processing due to the mixing of pixels and the complexity of hyperspectral remote sensing images. Existing geometry-based endmember extraction algorithms have achieved good results, but most of them perform poorly when the assumption of a simplex structure is not met. Recently, many intelligent optimization algorithms have been employed to solve the endmember extraction problem. Although they achieve better performance than geometry-based algorithms in different complex scenarios, they also suffer from high time cost. To alleviate these problems and balance the two key indicators of accuracy and running time, an adaptive surrogate-assisted endmember extraction (ASAEE) framework based on intelligent optimization algorithms is proposed for hyperspectral remote sensing images in this paper. In the proposed framework, a surrogate-assisted model is established to reduce the expensive time cost of the intelligent algorithms by fitting the fully constrained evaluation value with a low-cost estimated value. In more detail, three commonly used intelligent algorithms, namely the genetic algorithm, the particle swarm optimization algorithm, and the differential evolution algorithm, are designed into the ASAEE framework to verify its effectiveness and robustness. In addition, an adaptive-weight surrogate-assisted model selection strategy is proposed, which can automatically adjust the weights of different surrogate models according to the characteristics of the different intelligent algorithms. Experimental results on three datasets (two simulated and one real) show the effectiveness and excellent performance of the proposed ASAEE framework.
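The surrogate-assisted evaluation idea can be sketched as follows: only a fraction of candidate solutions is scored with the expensive fully constrained objective, a cheap regressor is fitted on those scores, and the remaining candidates receive predicted values. The k-nearest-neighbour surrogate, the 20% exact-evaluation ratio, and the function names are illustrative assumptions, not the ASAEE framework itself.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

def surrogate_assisted_eval(population, expensive_fitness, exact_ratio=0.2):
    """population : (N, D) candidate endmember encodings (NumPy array)."""
    n = len(population)
    n_exact = max(int(n * exact_ratio), 2)
    idx = np.random.permutation(n)
    exact_idx, cheap_idx = idx[:n_exact], idx[n_exact:]

    # expensive, fully constrained evaluation on a small subset
    exact_scores = np.array([expensive_fitness(population[i]) for i in exact_idx])

    # cheap surrogate fitted on the exact evaluations
    surrogate = KNeighborsRegressor(n_neighbors=min(3, n_exact))
    surrogate.fit(population[exact_idx], exact_scores)

    scores = np.empty(n)
    scores[exact_idx] = exact_scores
    scores[cheap_idx] = surrogate.predict(population[cheap_idx])  # estimated fitness
    return scores   # used by the GA / PSO / DE loop to select survivors
```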
16. A Novel Power-Saving Reversing Camera System with Artificial Intelligence Object Detection. Electronics 2022. [DOI: 10.3390/electronics11020282]
Abstract
According to a study by the Insurance Institute for Highway Safety (IIHS), the driving collision rate when using only a reversing camera system is lower than that when using both a reversing camera system and reversing radar. In this article, we implement a reversing camera system with artificial intelligence object detection to increase the information in the reversing image. Our system consists of an image processing chip (IPC) with wide-angle image distortion correction and an image buffer controller, a low-power KL520 chip, and an optimized artificial intelligence model, MobileNetV2-YOLOv3-Optimized (MNYLO). The experimental results show three advantages of our system. Firstly, through the image distortion correction of the IPC, we can restore the distorted reversing image. Secondly, by training the artificial intelligence model on a public dataset and on images collected in various weather conditions, our system does not need image restoration algorithms to remove weather effects such as rain, fog, and snow; objects can still be detected in images degraded by weather. Thirdly, compared with the Tiny-YOLOv3 model, the parameters of our MNYLO are reduced by 72.3% and the computation by 86.4%, while the object detection rate is maintained without a sharp drop.