1. Dang J, Zheng H, Xu X, Wang L, Hu Q, Guo Y. Adaptive Sparse Memory Networks for Efficient and Robust Video Object Segmentation. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:3820-3833. PMID: 38315589. DOI: 10.1109/tnnls.2024.3357118.
Abstract
Recently, memory-based networks have achieved promising performance for video object segmentation (VOS). However, existing methods still suffer from unsatisfactory segmentation accuracy and inferior efficiency. The reasons are mainly twofold: 1) during memory construction, the inflexible memory storage mechanism results in a weak discriminative ability for similar appearances in complex scenarios, leading to video-level temporal redundancy, and 2) during memory reading, matching robustness and memory retrieval accuracy decrease as the number of video frames increases. To address these challenges, we propose an adaptive sparse memory network (ASM) that efficiently and effectively performs VOS by sparsely leveraging previous guidance while attending to key information. Specifically, we design an adaptive sparse memory constructor (ASMC) to adaptively memorize informative past frames according to dynamic temporal changes in video frames. Furthermore, we introduce an attentive local memory reader (ALMR) to quickly retrieve relevant information using a subset of memory, thereby reducing frame-level redundant computation and noise in a simpler and more convenient manner. To prevent key features from being discarded by the subset of memory, we further propose a novel attentive local feature aggregation (ALFA) module, which preserves useful cues by selectively aggregating discriminative spatial dependence from adjacent frames, thereby effectively increasing the receptive field of each memory frame. Extensive experiments demonstrate that our model achieves state-of-the-art performance with real-time speed on six popular VOS benchmarks. Furthermore, our ASM can be applied to existing memory-based methods as generic plugins to achieve significant performance improvements. More importantly, our method exhibits robustness in handling sparse videos with low frame rates.
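The memory reading that this abstract builds on is, at its core, softmax attention of query-frame keys over stored memory keys. The sketch below shows only that generic readout; the shapes, temperature parameter, and `memory_read` name are illustrative assumptions, not the authors' ASM/ALMR implementation.

```python
import numpy as np

def memory_read(query_keys, mem_keys, mem_values, temperature=1.0):
    """Generic key-value memory readout used in memory-based VOS:
    each query pixel attends over all stored memory pixels and
    retrieves a value as an attention-weighted average.

    query_keys : (Nq, C)  key embeddings of the current frame
    mem_keys   : (Nm, C)  key embeddings of stored memory frames
    mem_values : (Nm, D)  value embeddings of stored memory frames
    """
    # scaled dot-product similarity between queries and memory keys
    sim = query_keys @ mem_keys.T / (temperature * np.sqrt(query_keys.shape[1]))
    # softmax over the memory dimension (numerically stabilized)
    sim -= sim.max(axis=1, keepdims=True)
    attn = np.exp(sim)
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ mem_values  # (Nq, D) retrieved features

# toy usage: 4 query pixels, 6 memory pixels
rng = np.random.default_rng(0)
out = memory_read(rng.normal(size=(4, 8)),
                  rng.normal(size=(6, 8)),
                  rng.normal(size=(6, 16)))
print(out.shape)  # (4, 16)
```

In an ALMR-style reader, `mem_keys`/`mem_values` would be restricted to an adaptively chosen subset of frames, which shrinks `Nm` and with it the cost of the `Nq x Nm` similarity matrix.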
2. Wang M, Wei S, Zhou Z, Shi J, Zhang X, Guo Y. CTV-Net: Complex-Valued TV-Driven Network With Nested Topology for 3-D SAR Imaging. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:5588-5602. PMID: 36178996. DOI: 10.1109/tnnls.2022.3208252.
Abstract
Regularization-based approaches offer promise in improving synthetic aperture radar (SAR) imaging quality while reducing system complexity. However, the widely applied l1 regularization model is hindered by its assumption of inherent sparsity, which leads to unrealistic estimates of surface-like targets. Inspired by the edge-preserving property of total variation (TV), we propose a new complex-valued TV (CTV)-driven interpretable neural network with nested topology, i.e., CTV-Net, for 3-D SAR imaging. In our scheme, based on the 2-D holography imaging operator, a CTV-driven optimization model is constructed to pursue precise estimates in weakly sparse scenarios. Subsequently, a nested algorithmic framework, i.e., complex-valued TV-driven fast iterative shrinkage thresholding (CTV-FIST), is derived from the theory of proximal gradient descent (PGD) and the FIST algorithm, theoretically supporting the design of CTV-Net. In CTV-Net, the trainable weights are layer-varied and functionally related to the hyperparameters of CTV-FIST, which constrains the algorithmic parameters to update in a well-conditioned manner. All weights are learned by end-to-end training with a two-term cost function that bounds the measurement fidelity and the TV norm simultaneously. Under the guidance of the SAR signal model, a reasonably sized training set is generated by randomly selecting reference images from the MNIST set and synthesizing the corresponding complex-valued label signals. Finally, the methodology is validated, numerically and visually, by extensive SAR simulations and real-measured experiments; the results demonstrate the viability and efficiency of the proposed CTV-Net in recovering 3-D SAR images from incomplete echoes.
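CTV-FIST is derived from proximal gradient descent and FISTA; for orientation, here is a minimal real-valued FISTA for the l1-regularized baseline model that the abstract contrasts with TV. The matrix sizes, `lam`, and iteration count are arbitrary assumptions; the paper's method would replace the soft-thresholding prox with a complex-valued TV proximal step.

```python
import numpy as np

def soft_threshold(x, t):
    # proximal operator of the l1 norm
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def fista_l1(A, y, lam, n_iter=200):
    """Minimize 0.5*||A x - y||^2 + lam*||x||_1 with FISTA.
    Real-valued sketch of the iteration family that CTV-FIST
    (and its unrolled network) belongs to."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1]); z = x.copy(); t = 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ z - y)
        x_new = soft_threshold(z - grad / L, lam / L)
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        z = x_new + (t - 1) / t_new * (x_new - x)
        x, t = x_new, t_new
    return x

# toy usage: recover a 3-sparse vector from 40 random measurements
rng = np.random.default_rng(1)
A = rng.normal(size=(40, 100)) / np.sqrt(40)
x_true = np.zeros(100); x_true[[5, 30, 77]] = [1.0, -2.0, 1.5]
x_hat = fista_l1(A, A @ x_true, lam=0.01)
print(np.abs(x_hat - x_true).max())
```

Unrolling this loop into a fixed number of layers, with `lam / L`-type quantities made trainable per layer, is the generic recipe behind interpretable networks of this kind.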
3. Pu W, Bao Y. RPCA-AENet: Clutter Suppression and Simultaneous Stationary Scene and Moving Targets Imaging in the Presence of Motion Errors. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:2339-2352. PMID: 35969541. DOI: 10.1109/tnnls.2022.3189997.
Abstract
Clutter suppression and ground moving target imaging in synthetic aperture radar (SAR) systems have been receiving increasing attention for both civilian and military applications. In practice, the problem is much more challenging due to the motion errors of the radar platform. In this article, we focus on clutter suppression and simultaneous stationary and moving target imaging in the presence of motion errors. Specifically, we propose a robust principal component analysis autoencoder network (RPCA-AENet) for a single-channel SAR system. In RPCA-AENet, the encoder transforms the SAR echo into imaging results of the stationary scene and the ground moving targets, and the decoder regenerates the SAR echo from the obtained imaging results. The encoder is designed by unfolding robust principal component analysis (RPCA), while the decoder is formulated as two dense layers and one additional layer. A joint reconstruction loss, entropy loss, and measurement distance loss are used to guide the training of RPCA-AENet. Notably, the algorithm operates in a fully self-supervised form and requires no labeled SAR data. The methodology was tested on numerical SAR data; these tests show that the proposed architecture outperforms other state-of-the-art methods.
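The encoder here unfolds RPCA iterations; the classical (non-learned) iteration being unfolded alternates singular value thresholding for the low-rank part with soft thresholding for the sparse part. A minimal sketch, with the default `lam`/`mu` choices taken from the standard principal component pursuit recipe rather than from the paper:

```python
import numpy as np

def svt(M, tau):
    # singular value thresholding: prox of the nuclear norm
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def rpca(D, lam=None, mu=None, n_iter=200):
    """Decompose D into low-rank L plus sparse S by a basic
    augmented-Lagrangian scheme (principal component pursuit).
    In SAR terms: L ~ stationary clutter, S ~ moving targets."""
    m, n = D.shape
    lam = lam or 1.0 / np.sqrt(max(m, n))
    mu = mu or 0.25 * m * n / np.abs(D).sum()
    L = np.zeros_like(D); S = np.zeros_like(D); Y = np.zeros_like(D)
    for _ in range(n_iter):
        L = svt(D - S + Y / mu, 1.0 / mu)          # low-rank update
        R = D - L + Y / mu
        S = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0.0)  # sparse update
        Y += mu * (D - L - S)                       # dual ascent
    return L, S

# toy usage: rank-2 background plus a few large sparse spikes
rng = np.random.default_rng(2)
L0 = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 50))
S0 = np.zeros((50, 50)); S0[rng.integers(0, 50, 30), rng.integers(0, 50, 30)] = 10.0
L_hat, S_hat = rpca(L0 + S0)
print(np.linalg.norm(L_hat - L0) / np.linalg.norm(L0))
```

An unrolled encoder replaces the fixed `lam`/`mu` schedule with learned per-layer weights, which is what makes the scheme trainable end-to-end.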
4. Pu W. Shuffle GAN With Autoencoder: A Deep Learning Approach to Separate Moving and Stationary Targets in SAR Imagery. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:4770-4784. PMID: 33684045. DOI: 10.1109/tnnls.2021.3060747.
Abstract
Synthetic aperture radar (SAR) has been widely applied in both civilian and military fields because it provides high-resolution images of ground targets regardless of weather conditions, day or night. In SAR imaging, the separation of moving and stationary targets is of great significance, as it can remove the ambiguity stemming from inevitable moving targets in stationary scene imaging and suppress clutter in moving target imaging. The newly emerged generative adversarial networks (GANs) perform well in many other signal processing areas; however, they have not been introduced to radar imaging tasks. In this work, we propose a novel shuffle GAN with autoencoder separation method to separate the moving and stationary targets in SAR imagery. The proposed algorithm exploits the independence of well-focused stationary targets and blurred moving targets to create adversarial constraints. Note that the algorithm operates in a fully unsupervised fashion, without requiring a sample set that contains mixed and separated SAR images. Experiments are carried out on synthetic and real SAR data to validate the effectiveness of the proposed method.
5. Hou J, Zhang F, Qiu H, Wang J, Wang Y, Meng D. Robust Low-Tubal-Rank Tensor Recovery From Binary Measurements. IEEE Transactions on Pattern Analysis and Machine Intelligence 2022; 44:4355-4373. PMID: 33656988. DOI: 10.1109/tpami.2021.3063527.
Abstract
Low-rank tensor recovery (LRTR) is a natural extension of low-rank matrix recovery (LRMR) to high-dimensional arrays, which aims to reconstruct an underlying tensor X from incomplete linear measurements [Formula: see text]. However, LRTR ignores the error caused by quantization, limiting its application when the quantization is coarse. In this work, we take into account the impact of extreme quantization and suppose that the quantizer degrades into a comparator that only acquires the signs of [Formula: see text]. We still hope to recover X from these binary measurements. Under the tensor singular value decomposition (t-SVD) framework, two recovery methods are proposed: the first is a tensor hard singular tube thresholding method; the second is a constrained tensor nuclear norm minimization method. These methods can recover a real n1×n2×n3 tensor X with tubal rank r from m random Gaussian binary measurements, with errors decaying at a polynomial rate in the oversampling factor λ:=m/((n1+n2)n3r). To improve the convergence rate, we develop a new quantization scheme under which the convergence rate can be accelerated to an exponential function of λ. Numerical experiments verify our results, and applications to real-world data demonstrate the promising performance of the proposed methods.
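Under the t-SVD framework, "tubal rank r" means every frontal slice of the tensor's FFT along the third mode has matrix rank at most r. The hard singular tube thresholding idea can be sketched as slice-wise SVD truncation in the Fourier domain; the tensor sizes and the `t_product` helper below are illustrative assumptions.

```python
import numpy as np

def tubal_truncate(X, r):
    """Truncate a tensor to tubal rank r under the t-SVD framework:
    FFT along the third mode, truncate the matrix SVD of every
    frontal slice to rank r, then inverse FFT."""
    Xf = np.fft.fft(X, axis=2)
    out = np.zeros_like(Xf)
    for k in range(X.shape[2]):
        U, s, Vt = np.linalg.svd(Xf[:, :, k], full_matrices=False)
        out[:, :, k] = (U[:, :r] * s[:r]) @ Vt[:r, :]
    return np.real(np.fft.ifft(out, axis=2))

def t_product(A, B):
    # t-product of two tensors: slice-wise matmul in the Fourier domain
    Af, Bf = np.fft.fft(A, axis=2), np.fft.fft(B, axis=2)
    Cf = np.einsum('ijk,jlk->ilk', Af, Bf)
    return np.real(np.fft.ifft(Cf, axis=2))

# sanity check: a tensor of tubal rank 2 is reproduced exactly
rng = np.random.default_rng(3)
X = t_product(rng.normal(size=(8, 2, 5)), rng.normal(size=(2, 8, 5)))
err = np.linalg.norm(tubal_truncate(X, 2) - X)
print(err)
```

The thresholding method in the abstract applies this kind of truncation to an estimate built from the sign measurements rather than to the tensor itself.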
6. Baek S, Jung Y, Lee S. Signal Expansion Method in Indoor FMCW Radar Systems for Improving Range Resolution. Sensors 2021; 21:4226. PMID: 34203035. PMCID: PMC8235730. DOI: 10.3390/s21124226.
Abstract
As various unmanned autonomous driving technologies, such as autonomous vehicles and autonomous drones, are being developed, research on FMCW radar, a key sensor for these technologies, is being actively conducted. The range resolution, a parameter for accurately detecting an object in an FMCW radar system, depends on the modulation bandwidth. Expensive radars have a large modulation bandwidth, use bands above 77 GHz, and are mainly used as in-vehicle radar sensors. However, these high-performance radars are expensive, making them burdensome for applications that require precise sensing, such as indoor motion detection and autonomous drones. In this paper, the range resolution is improved beyond the limit set by the modulation bandwidth by extending the beat-frequency signal in the time domain through the proposed Adaptive Mirror Padding and Phase Correction Padding. The proposed algorithms match the range RMSE of the existing Zero Padding and Mirror Padding, but show improved results in ρs, which measures the size of the side lobe relative to the main lobe, and in the detection rate of OS-CFAR. For ρs with single targets, Adaptive Mirror Padding improved by about 3 times and Phase Correction Padding by about 6 times compared to the existing algorithms. The OS-CFAR results were evaluated separately for single and multiple targets: for single targets, Adaptive Mirror Padding improved the detection rate by about 10% and Phase Correction Padding by about 20% compared to the existing algorithms; for multiple targets, Phase Correction Padding improved it by about 20%. The proposed algorithms were verified with the MATLAB tool and an actual FMCW radar; since the results were similar in both experimental environments, the algorithms were verified to work on real radar as well.
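For context, plain zero padding of the beat signal before the FFT interpolates the range spectrum and reduces the on-grid peak-location error, which is the baseline behaviour the proposed paddings improve on. The sample rate, beat frequency, and chirp length below are arbitrary assumed values.

```python
import numpy as np

fs, N = 1.0e6, 256                 # sample rate (Hz) and samples per chirp (assumed)
f_beat = 37_400.0                  # beat frequency of a single target (Hz, assumed)
n = np.arange(N)
beat = np.cos(2 * np.pi * f_beat * n / fs)

def range_spectrum(sig, n_fft):
    # zero padding in time interpolates the range (beat-frequency) spectrum
    return np.abs(np.fft.rfft(sig, n=n_fft))

# without padding the peak falls on a coarse frequency grid; padding refines it
for n_fft in (N, 8 * N):
    spec = range_spectrum(beat, n_fft)
    peak_hz = np.argmax(spec) * fs / n_fft
    print(n_fft, peak_hz)
```

Note that such padding only refines the grid: it does not create new bandwidth, which is why the paper proposes signal extension schemes (mirror and phase-corrected copies of the beat signal) instead of plain zeros.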
Affiliation(s)
- Seongmin Baek: Department of Information and Communication Engineering & Convergence Engineering for Intelligent Drone, Sejong University, Seoul 05006, Korea
- Yunho Jung: Department of Smart Drone Convergence, School of Electronics and Information Engineering, Aerospace University, Goyang-si 10540, Korea
- Seongjoo Lee (corresponding author): Department of Information and Communication Engineering & Convergence Engineering for Intelligent Drone, Sejong University, Seoul 05006, Korea
7.

Abstract
SAR image registration is a crucial problem in SAR image processing, since high-precision registration results help improve the quality of downstream tasks such as change detection of SAR images. Recently, most DL-based SAR image registration methods have treated registration as a binary classification problem with matching and non-matching categories, where a fixed scale is generally set to capture the pairs of image blocks around key points that form the training set. However, image blocks at different scales contain different information, which affects registration performance. Moreover, the number of key points is not sufficient to generate a large number of class-balanced training samples. Hence, we propose a new SAR image registration method that uses information from multiple scales to construct the matching models. Specifically, considering that the number of training samples is small, a deep forest is employed to train multiple matching models. Moreover, a multi-scale fusion strategy is proposed to integrate the multiple predictions and obtain the best pairs of matching points between the reference image and the sensed image. Finally, experimental results on four datasets show that the proposed method outperforms the compared state-of-the-art methods, and analyses of different scales also indicate that fusing multiple scales is more effective and robust for SAR image registration than any single fixed scale.
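The multi-scale fusion idea (score a candidate correspondence with several block sizes, then combine) can be sketched with a hand-crafted matcher; here plain normalized cross-correlation stands in for the paper's learned per-scale deep-forest models, and the scales and image sizes are assumptions.

```python
import numpy as np

def ncc(a, b):
    # normalized cross-correlation between two equally sized patches
    a = (a - a.mean()) / (a.std() + 1e-9)
    b = (b - b.mean()) / (b.std() + 1e-9)
    return float((a * b).mean())

def multiscale_match(ref, sen, pt_ref, pt_sen, scales=(8, 16, 32)):
    """Score one candidate correspondence at several block scales and
    fuse the per-scale scores by averaging; NCC stands in for the
    learned per-scale matchers."""
    scores = []
    for s in scales:
        r0, c0 = pt_ref; r1, c1 = pt_sen
        pa = ref[r0 - s:r0 + s, c0 - s:c0 + s]
        pb = sen[r1 - s:r1 + s, c1 - s:c1 + s]
        scores.append(ncc(pa, pb))
    return float(np.mean(scores))

# toy usage: the matching location scores higher than a mismatched one
rng = np.random.default_rng(7)
ref = rng.normal(size=(128, 128))
sen = ref + 0.1 * rng.normal(size=(128, 128))   # same scene, mild noise
good = multiscale_match(ref, sen, (64, 64), (64, 64))
bad = multiscale_match(ref, sen, (64, 64), (40, 90))
print(good, bad)
```

Averaging is only the simplest fusion rule; a learned or weighted combination of the per-scale scores is where a strategy like the paper's would differ.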
8. Complex-Valued Pix2pix—Deep Neural Network for Nonlinear Electromagnetic Inverse Scattering. Electronics 2021. DOI: 10.3390/electronics10060752.
Abstract
Nonlinear electromagnetic inverse scattering is an imaging technique with quantitative reconstruction and high resolution. Compared with conventional tomography, it accounts for the more realistic interaction between the internal structure of the scene and the electromagnetic waves. However, open issues and challenges remain due to its inherent strong non-linearity, ill-posedness, and computational cost. To overcome these shortcomings, we apply an image translation network, named Complex-Valued Pix2pix, to the electromagnetic inverse scattering problem. Complex-Valued Pix2pix consists of two parts: a Generator and a Discriminator. The Generator employs a multi-layer complex-valued convolutional neural network, while the Discriminator computes the maximum likelihood between the original value and the reconstructed value for the real and imaginary parts of the complex values separately. The results show that Complex-Valued Pix2pix can learn the mapping from the initial contrast to the real contrast in microwave imaging models. Moreover, owing to the introduction of the discriminator, Complex-Valued Pix2pix can capture more features of the nonlinearity than a traditional convolutional neural network (CNN) through adversarial training. Therefore, setting aside the time cost of training, Complex-Valued Pix2pix may be a more effective way to solve inverse scattering problems than other deep learning methods. The main contribution of this work is the realization of a generative adversarial network (GAN) for the electromagnetic inverse scattering problem, adding a discriminator to the traditional CNN method to improve network training. It has the prospect of outperforming conventional methods in terms of both image quality and computational efficiency.
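The building block of a complex-valued convolutional layer is complex multiplication implemented with real arithmetic: (a+ib)(c+id) = (ac - bd) + i(ad + bc). A minimal 1-D sketch, checked against NumPy's native complex convolution (the layer sizes in the paper's Generator are of course different):

```python
import numpy as np

def complex_conv1d(x, w):
    """Complex-valued convolution realized with four real convolutions:
    (a+ib)*(c+id) = (a*c - b*d) + i*(a*d + b*c), the building block of
    complex-valued convolutional layers."""
    a, b = x.real, x.imag          # signal real/imag parts
    c, d = w.real, w.imag          # kernel real/imag parts
    real = np.convolve(a, c, mode='valid') - np.convolve(b, d, mode='valid')
    imag = np.convolve(a, d, mode='valid') + np.convolve(b, c, mode='valid')
    return real + 1j * imag

# sanity check against numpy's native complex convolution
rng = np.random.default_rng(4)
x = rng.normal(size=16) + 1j * rng.normal(size=16)
w = rng.normal(size=3) + 1j * rng.normal(size=3)
err = np.abs(complex_conv1d(x, w) - np.convolve(x, w, mode='valid')).max()
print(err)
```

In a deep learning framework the same decomposition lets real-valued convolution kernels (with learned real and imaginary weight tensors) process complex field data without discarding phase.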
9. Pu W. Deep SAR Imaging and Motion Compensation. IEEE Transactions on Image Processing 2021; 30:2232-2247. PMID: 33471760. DOI: 10.1109/tip.2021.3051484.
Abstract
Compressive sensing (CS) and matrix sensing (MS) techniques have been applied to the synthetic aperture radar (SAR) imaging problem to reduce the amount of SAR echo sampling by using sparse or low-rank prior information. To further exploit redundancy and improve sampling efficiency, we take a different approach and propose a deep SAR imaging algorithm. The main idea is to exploit the redundancy of the backscattering coefficient using an auto-encoder structure, wherein the hidden latent layer of the auto-encoder has a lower dimension and fewer parameters than the backscattering coefficient layer. Based on the auto-encoder model, the parameters of the auto-encoder structure and the backscattering coefficient are estimated simultaneously by optimizing the reconstruction loss associated with the down-sampled SAR echo. In addition, to meet practical application requirements, a deep SAR motion compensation algorithm is proposed to eliminate the effect of motion errors on imaging results. The effectiveness of the proposed algorithms is verified on both simulated and real SAR data.
10. A Novel Post-Doppler Parametric Adaptive Matched Filter for Airborne Multichannel Radar. Remote Sensing 2020. DOI: 10.3390/rs12244017.
Abstract
The post-Doppler adaptive matched filter (PD-AMF) with the constant false alarm rate (CFAR) property was developed for adaptive detection of moving targets, and is a standardized version of post-Doppler space-time adaptive processing (PD-STAP) for practical applications. However, its detection performance is severely constrained by the training data, especially in a dense signal environment: improper training data and contamination by moving target signals markedly degrade disturbance suppression and result in target cancellation through self-whitening. To address these issues, a novel post-Doppler parametric adaptive matched filter (PD-PAMF) detector is proposed in the range-Doppler domain. Specifically, the detector is derived via the post-Doppler matched filter (PD-MF) and the lower-diagonal-upper (LDU) decomposition of the disturbance covariance matrix, and the disturbance signals of the spatial sequence are modelled as an auto-regressive (AR) process for filtering. Once the disturbance is suppressed, ground moving targets can be detected and their geographical positions and line-of-sight velocities estimated. The PD-PAMF achieves higher performance using only a smaller amount of training data; more importantly, it is tolerant to moving target signals contained in the training data, and it has lower computational complexity. Numerical results demonstrate the effectiveness of the proposed detector.
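Modelling the disturbance as an auto-regressive (AR) process means the whitening filter is simply the AR prediction-error filter, whose few coefficients can be fitted from limited training data. A generic Yule-Walker sketch on a toy temporal sequence (the AR order and coefficients are assumptions; the paper applies this idea to spatial sequences in the range-Doppler domain):

```python
import numpy as np

def yule_walker(x, p):
    """Fit AR(p) coefficients to a sequence by solving the
    Yule-Walker equations from sample autocovariances."""
    x = x - x.mean()
    r = np.array([np.dot(x[:len(x) - k], x[k:]) / len(x) for k in range(p + 1)])
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])  # Toeplitz
    return np.linalg.solve(R, r[1:])

def ar_whiten(x, a):
    # prediction-error (whitening) filter: e[n] = x[n] - sum_k a_k x[n-k]
    p = len(a)
    return np.array([x[n] - np.dot(a, x[n - p:n][::-1]) for n in range(p, len(x))])

# toy usage: whiten a strongly correlated AR(2) disturbance
rng = np.random.default_rng(5)
e = rng.normal(size=5000)
x = np.zeros(5000)
for n in range(2, 5000):
    x[n] = 1.5 * x[n - 1] - 0.7 * x[n - 2] + e[n]
a_hat = yule_walker(x, 2)
res = ar_whiten(x, a_hat)
print(a_hat, x.var() / res.var())   # residual variance drops sharply
```

Because only `p` coefficients are estimated instead of a full covariance matrix, far fewer training samples are needed, which is the efficiency argument behind parametric detectors of this family.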
11. Pu W, Wu J, Huang Y, Yang J. Squinted Airborne Synthetic Aperture Radar Imaging with Unknown Curved Trajectory. Sensors 2020; 20:6026. PMID: 33114022. PMCID: PMC7672571. DOI: 10.3390/s20216026.
Abstract
Imaging for airborne highly squinted synthetic aperture radar (SAR) with a curved trajectory is challenging due to translationally variant range cell migration (RCM) and azimuth modulation. Moreover, in most practical applications the curved trajectory is not accurately known, which makes the imaging problem even harder. To address these issues, we propose a novel motion-modelling and optimisation based imaging algorithm for highly squinted SAR with an unknown curved trajectory. First, to correct the translationally variant RCM, a coarse-to-fine RCM correction scheme with a range perturbation approach is applied. Afterwards, an optimisation model for the motion information under the minimum-entropy criterion is built during azimuth processing by nonlinear chirp scaling (NLCS). Correspondingly, a differential evolution (DE) optimisation strategy is proposed to estimate the motion information iteratively. We empirically compare the proposed algorithm with several state-of-the-art highly squinted curved-trajectory SAR imaging algorithms; numerical results show the effectiveness of the proposed method without any prior information about the curved trajectory.
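The minimum-entropy criterion scores a candidate motion estimate by the entropy of the resulting image, which drops as the image focuses. The toy below compensates a 1-D quadratic phase error by grid search; the paper instead searches with differential evolution over a real motion model, so the signal, the quadratic error model, and the grid here are illustrative assumptions.

```python
import numpy as np

def image_entropy(img):
    """Entropy of the normalized image intensity; sharper (better
    focused) images have lower entropy."""
    p = np.abs(img) ** 2
    p = p / p.sum()
    p = p[p > 0]
    return -np.sum(p * np.log(p))

# toy autofocus: a 1-D aperture with an unknown quadratic phase error
N = 128
n = np.arange(N) - N // 2
signal = np.zeros(N, complex)
signal[[40, 70, 75]] = [1, 2, 1.5]                   # three point targets
aperture = np.fft.ifft(signal)                       # ideal phase history
true_k = 3.1e-3
corrupted = aperture * np.exp(1j * true_k * n ** 2)  # quadratic phase error

# grid-search the phase coefficient that minimizes image entropy
ks = np.linspace(-5e-3, 5e-3, 401)
ents = [image_entropy(np.fft.fft(corrupted * np.exp(-1j * k * n ** 2))) for k in ks]
k_hat = ks[int(np.argmin(ents))]
print(k_hat)
```

With a realistic multi-parameter motion model the entropy surface is non-convex and high-dimensional, which is why a global optimizer such as differential evolution is used instead of a grid.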
Affiliation(s)
- Wei Pu (corresponding author; Tel.: +86-28-6183-1369): Department of Electronic and Electrical Engineering, University College London, London WC1E 6BT, UK
- Junjie Wu: School of Information and Communication Engineering, University of Electronic Science and Technology of China, 2006 Xiyuan Road, Gaoxin Western District, Chengdu 611731, China
- Yulin Huang: School of Information and Communication Engineering, University of Electronic Science and Technology of China, 2006 Xiyuan Road, Gaoxin Western District, Chengdu 611731, China
- Jianyu Yang: School of Information and Communication Engineering, University of Electronic Science and Technology of China, 2006 Xiyuan Road, Gaoxin Western District, Chengdu 611731, China
12.

Abstract
Stable and efficient ground moving target tracking and refocusing is a hard task in synthetic aperture radar (SAR) data processing. Since shadows in video-SAR indicate the actual positions of moving targets at different moments without any displacement, shadow-based methods provide a new approach to ground moving target processing. This paper constructs a novel framework to refocus ground moving targets by using shadows in video-SAR. To this end, an automatically registered SAR video is first obtained using the video-SAR back-projection (v-BP) algorithm. The shadows of multiple moving targets are then tracked using a learning-based tracker, and the moving targets are finally refocused via a proposed moving target back-projection (m-BP) algorithm. With this framework, we can perform detection, tracking, and imaging for multiple moving targets in an integrated manner, which significantly improves the moving-target surveillance capability of SAR systems. Furthermore, a detailed analysis of the shadow of a moving target is presented. We find that the shadow of a ground moving target is affected by the target's size, the radar pitch angle, the carrier frequency, the synthetic aperture time, etc. With an elaborate system design, a clear shadow of moving targets can be obtained even in the X or C band. Through numerical experiments, we find that a deep network such as SiamFC can easily track the shadows and estimate trajectories precisely enough to meet the accuracy requirements of m-BP.
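Back-projection (the basis of both v-BP and m-BP) forms each pixel by summing range-compressed echoes at that pixel's range after removing the two-way carrier phase. A delta-like toy simulation: the geometry, carrier frequency, and sampling below are assumed values, and a stationary point target stands in for the general case (m-BP would additionally move the hypothesized pixel along the tracked trajectory).

```python
import numpy as np

c = 3e8
fc = 1e10                                  # carrier frequency (assumed X-band)
# platform positions over slow time (straight track, assumed geometry)
xs = np.linspace(-100, 100, 201)
platform = np.stack([xs, np.zeros_like(xs), np.full_like(xs, 500.0)], axis=1)

target = np.array([10.0, 300.0, 0.0])      # a single point scatterer

# simulate ideal range-compressed data: unit response at the true
# range carrying the two-way carrier phase (delta-like range profile)
r_axis = np.linspace(400, 800, 1024)
dr = r_axis[1] - r_axis[0]
data = np.zeros((len(xs), len(r_axis)), complex)
for i, p in enumerate(platform):
    R = np.linalg.norm(p - target)
    j = int(round((R - r_axis[0]) / dr))
    data[i, j] = np.exp(-1j * 4 * np.pi * fc * R / c)

def backproject(data, platform, grid_pts):
    """Back-projection: for each pixel, sum the range-compressed
    samples at that pixel's range, after removing the carrier phase."""
    img = np.zeros(len(grid_pts), complex)
    for i, p in enumerate(platform):
        R = np.linalg.norm(grid_pts - p, axis=1)
        j = np.clip(np.round((R - r_axis[0]) / dr).astype(int), 0, len(r_axis) - 1)
        img += data[i, j] * np.exp(1j * 4 * np.pi * fc * R / c)
    return np.abs(img)

# evaluate on a small line of candidate pixels through the target
grid = np.array([[x, 300.0, 0.0] for x in np.linspace(0.0, 20.0, 41)])
img = backproject(data, platform, grid)
print(grid[int(np.argmax(img))])   # peaks at the true target position
```

Because every aperture position contributes coherently only where the hypothesized range history matches the data, the sum peaks at the true pixel; using the shadow-derived trajectory to shift the hypothesized pixel per pulse is what lets m-BP refocus a mover with the same machinery.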