51
|
|
52
|
Bing Z, Lemke C, Cheng L, Huang K, Knoll A. Energy-efficient and damage-recovery slithering gait design for a snake-like robot based on reinforcement learning and inverse reinforcement learning. Neural Netw 2020; 129:323-333. [PMID: 32593929 DOI: 10.1016/j.neunet.2020.05.029] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 09/30/2019] [Revised: 04/18/2020] [Accepted: 05/24/2020] [Indexed: 10/24/2022]
Abstract
Similar to real snakes in nature, the flexible trunks of snake-like robots enhance their movement capabilities and adaptabilities in diverse environments. However, this flexibility corresponds to a complex control task involving highly redundant degrees of freedom, where traditional model-based methods usually fail to propel the robots energy-efficiently and adaptively to unforeseeable joint damage. In this work, we present an approach for designing an energy-efficient and damage-recovery slithering gait for a snake-like robot using the reinforcement learning (RL) algorithm and the inverse reinforcement learning (IRL) algorithm. Specifically, we first present an RL-based controller for generating locomotion gaits at a wide range of velocities, which is trained using the proximal policy optimization (PPO) algorithm. Then, by taking the RL-based controller as an expert and collecting trajectories from it, we train an IRL-based controller using the adversarial inverse reinforcement learning (AIRL) algorithm. For the purpose of comparison, a traditional parameterized gait controller is presented as the baseline and the parameter sets are optimized using the grid search and Bayesian optimization algorithm. Based on the analysis of the simulation results, we first demonstrate that this RL-based controller exhibits very natural and adaptive movements, which are also substantially more energy-efficient than the gaits generated by the parameterized controller. We then demonstrate that the IRL-based controller can not only exhibit performance similar to that of the RL-based controller, but can also recover from unpredictable damage to body joints and still outperform the model-based controller, which has an undamaged body, in terms of energy efficiency. Videos can be viewed at https://videoviewsite.wixsite.com/rlsnake.
Affiliation(s)
- Zhenshan Bing: Department of Computer Science, Technical University of Munich, Germany.
- Christian Lemke: Department of Computer Science, Ludwig Maximilian University of Munich, Germany.
- Long Cheng: College of Computer Science and Artificial Intelligence, Wenzhou University, China.
- Kai Huang: School of Data and Computer Science, Sun Yat-Sen University, China.
- Alois Knoll: Department of Computer Science, Technical University of Munich, Germany.
|
53
|
Shen J, Tang X, Dong X, Shao L. Visual Object Tracking by Hierarchical Attention Siamese Network. IEEE TRANSACTIONS ON CYBERNETICS 2020; 50:3068-3080. [PMID: 31536029 DOI: 10.1109/tcyb.2019.2936503] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Indexed: 06/10/2023]
Abstract
Visual tracking addresses the problem of localizing an arbitrary target in video according to the annotated bounding box. In this article, we present a novel tracking method by introducing the attention mechanism into the Siamese network to increase its matching discrimination. We propose a new way to compute attention weights to improve matching performance by a sub-Siamese network [Attention Net (A-Net)], which locates attentive parts for solving the searching problem. In addition, features in higher layers can preserve more semantic information while features in lower layers preserve more location information. Thus, in order to solve the tracking failure cases by the higher layer features, we fully utilize location and semantic information by multilevel features and propose a new way to fuse multiscale response maps from each layer to obtain a more accurate position estimation of the object. We further propose a hierarchical attention Siamese network by combining the attention weights and multilayer integration for tracking. Our method is implemented with a pretrained network which can outperform most well-trained Siamese trackers even without any fine-tuning and online updating. The comparison results with the state-of-the-art methods on popular tracking benchmarks show that our method achieves better performance. Our source code and results will be available at https://github.com/shenjianbing/HASN.
|
54
|
Li T, Song H, Zhang K, Liu Q. Recurrent reverse attention guided residual learning for saliency object detection. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.12.109] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/25/2022]
|
55
|
Wu Y, Li J, Wu J, Chang J. Siamese capsule networks with global and local features for text classification. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.01.064] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Indexed: 11/28/2022]
|
56
|
Tang Y, Yang X, Wang N, Song B, Gao X. CGAN-TM: A novel domain-to-domain transferring method for person re-identification. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2020; 29:5641-5651. [PMID: 32286985 DOI: 10.1109/tip.2020.2985545] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 06/11/2023]
Abstract
Person re-identification (re-ID) is a technique aiming to recognize a person across different cameras. Although some supervised methods have achieved favorable performance, they are far from practical application owing to the lack of labeled data. Thus, unsupervised person re-ID methods are urgently needed. Generally, the commonly used approach in existing unsupervised methods is to first utilize the source image dataset for generating a model in a supervised manner, and then transfer the source image domain to the target image domain. However, images may lose their identity information after translation, and the distributions of different domains are far apart. To solve these problems, we propose an image domain-to-domain translation method that keeps the pedestrian's identity information and pulls the domains' distributions closer for unsupervised person re-ID tasks. Our work exploits the CycleGAN to transfer the existing labeled image domain to the unlabeled image domain. Specifically, a Self-labeled Triplet Net is proposed to maintain the pedestrian identity information, and maximum mean discrepancy is introduced to pull the domain distributions closer. Extensive experiments have been conducted and the results demonstrate that the proposed method outperforms the state-of-the-art unsupervised methods on DukeMTMC-reID and Market-1501.
|
57
|
|
58
|
Yan M, Wang J, Li J, Zhang K, Yang Z. Traffic scene semantic segmentation using self-attention mechanism and bi-directional GRU to correlate context. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.12.007] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 10/25/2022]
|
59
|
Du Y, Yan Y, Chen S, Hua Y. Object-adaptive LSTM network for real-time visual tracking with adversarial data augmentation. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.12.022] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 12/01/2022]
|
60
|
Ruan W, Liang C, Yu Y, Chen J, Hu R. SIST: Online Scale-Adaptive Object tracking with Stepwise Insight. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.11.102] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/28/2022]
|
61
|
Cheng Y, Zhang X, Lu F, Lu F, Sato Y. Gaze Estimation by Exploring Two-Eye Asymmetry. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2020; 29:5259-5272. [PMID: 32224460 DOI: 10.1109/tip.2020.2982828] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Indexed: 06/10/2023]
Abstract
Eye gaze estimation is increasingly demanded by recent intelligent systems to facilitate a range of interactive applications. Unfortunately, learning the highly complicated regression from a single eye image to the gaze direction is not trivial. Thus, the problem is yet to be solved efficiently. Inspired by the two-eye asymmetry, as two eyes of the same person may appear uneven, we propose the face-based asymmetric regression-evaluation network (FARE-Net) to optimize the gaze estimation results by considering the difference between left and right eyes. The proposed method includes one face-based asymmetric regression network (FAR-Net) and one evaluation network (E-Net). The FAR-Net predicts 3D gaze directions for both eyes and is trained with the asymmetric mechanism, which asymmetrically weights and sums the loss generated by two-eye gaze directions. With the asymmetric mechanism, the FAR-Net utilizes the eyes that can achieve high performance to optimize the network. The E-Net learns the reliabilities of two eyes to balance the learning of the asymmetric mechanism and symmetric mechanism. Our FARE-Net achieves leading performances on MPIIGaze, EyeDiap and RT-Gene datasets. Additionally, we investigate the effectiveness of FARE-Net by analyzing the distribution of errors and through an ablation study.
|
62
|
Yang Z, Yu H, Feng M, Sun W, Lin X, Sun M, Mao ZH, Mian A. Small Object Augmentation of Urban Scenes for Real-Time Semantic Segmentation. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2020; 29:5175-5190. [PMID: 32191886 DOI: 10.1109/tip.2020.2976856] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 06/10/2023]
Abstract
Semantic segmentation is a key step in scene understanding for autonomous driving. Although deep learning has significantly improved the segmentation accuracy, current high-quality models such as PSPNet and DeepLabV3 are inefficient given their complex architectures and reliance on multi-scale inputs. Thus, it is difficult to apply them to real-time or practical applications. On the other hand, existing real-time methods cannot yet produce satisfactory results on small objects such as traffic lights, which are imperative to safe autonomous driving. In this paper, we improve the performance of real-time semantic segmentation from two perspectives, methodology and data. Specifically, we propose a real-time segmentation model coined Narrow Deep Network (NDNet) and build a synthetic dataset by inserting additional small objects into the training images. The proposed method achieves 65.7% mean intersection over union (mIoU) on the Cityscapes test set with only 8.4G floating-point operations (FLOPs) on 1024×2048 inputs. Furthermore, by re-training the existing PSPNet and DeepLabV3 models on our synthetic dataset, we obtained an average 2% mIoU improvement on small objects.
|
63
|
|
64
|
Tang Y, Zou W, Hua Y, Jin Z, Li X. Video salient object detection via spatiotemporal attention neural networks. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.09.064] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 10/26/2022]
|
65
|
|
66
|
Wang Z, Tian G. Integrating manifold ranking with boundary expansion and corners clustering for saliency detection of home scene. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.10.063] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 10/25/2022]
|
67
|
|
68
|
Liang Z, Shen J. Local Semantic Siamese Networks for Fast Tracking. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 29:3351-3364. [PMID: 31869793 DOI: 10.1109/tip.2019.2959256] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Indexed: 06/10/2023]
Abstract
Learning a powerful feature representation is critical for constructing a robust Siamese tracker. However, most existing Siamese trackers learn the global appearance features of the entire object, which usually suffer from drift problems caused by partial occlusion or non-rigid appearance deformation. In this paper, we propose a new Local Semantic Siamese (LSSiam) network to extract more robust features for solving these drift problems, since the local semantic features contain more fine-grained and partial information. We learn the semantic features during offline training by adding a classification branch into the classical Siamese framework. To further enhance the representation of features, we design a generally focal logistic loss to mine the hard negative samples. During online tracking, we remove the classification branch and propose an efficient template updating strategy to avoid an excessive computing load. Thus, the proposed tracker can run at a high speed of 100 frames per second (FPS), far beyond the real-time requirement. Extensive experiments on popular benchmarks demonstrate that the proposed LSSiam tracker achieves state-of-the-art performance at high speed. Our source code is available at.
|
69
|
|
70
|
Huang M, Liu Z, Ye L, Zhou X, Wang Y. Saliency detection via multi-level integration and multi-scale fusion neural networks. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2019.07.054] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 11/26/2022]
|
71
|
Mustafa A, Khan SH, Hayat M, Shen J, Shao L. Image Super-Resolution as a Defense Against Adversarial Attacks. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 29:1711-1724. [PMID: 31545722 DOI: 10.1109/tip.2019.2940533] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 06/10/2023]
Abstract
Convolutional Neural Networks have achieved significant success across multiple computer vision tasks. However, they are vulnerable to carefully crafted, human-imperceptible adversarial noise patterns which constrain their deployment in critical security-sensitive systems. This paper proposes a computationally efficient image enhancement approach that provides a strong defense mechanism to effectively mitigate the effect of such adversarial perturbations. We show that deep image restoration networks learn mapping functions that can bring off-the-manifold adversarial samples onto the natural image manifold, thus restoring classification towards correct classes. A distinguishing feature of our approach is that, in addition to providing robustness against attacks, it simultaneously enhances image quality and retains the model's performance on clean images. Furthermore, the proposed method does not modify the classifier or require a separate mechanism to detect adversarial images. The effectiveness of the scheme has been demonstrated through extensive experiments, where it has proven a strong defense in gray-box settings. The proposed scheme is simple and has the following advantages: (1) it does not require any model training or parameter optimization, (2) it complements other existing defense mechanisms, (3) it is agnostic to the attacked model and attack type and (4) it provides superior performance across all popular attack algorithms. Our code is publicly available at https://github.com/aamir-mustafa/super-resolution-adversarial-defense.
|
72
|
Chen C, Wang G, Peng C, Zhang X, Qin H. Improved Robust Video Saliency Detection based on Long-term Spatial-temporal Information. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 29:1090-1100. [PMID: 31449017 DOI: 10.1109/tip.2019.2934350] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 05/21/2023]
Abstract
This paper proposes to utilize supervised deep convolutional neural networks to take full advantage of the long-term spatial-temporal information in order to improve the video saliency detection performance. The conventional methods, which use solely the temporally neighboring frames, could easily encounter transient failure cases when the spatial-temporal saliency clues are less trustworthy for a long period. To tackle the aforementioned limitation, we plan to first identify those beyond-scope frames with trustworthy long-term saliency clues and then align them with the current problem domain for an improved video saliency detection.
|
73
|
Yang H, Bath PA. The Use of Data Mining Methods for the Prediction of Dementia: Evidence From the English Longitudinal Study of Aging. IEEE J Biomed Health Inform 2019; 24:345-353. [PMID: 31180874 DOI: 10.1109/jbhi.2019.2921418] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 11/10/2022]
Abstract
Dementia in older age is a major health concern with the increase in the aging population. Preventive measures to prevent or delay dementia symptoms are of utmost importance. In this study, a large and wide variety of factors from multiple domains were investigated using a large nationally representative sample of older people from the English Longitudinal Study of Ageing. Seven machine learning algorithms were implemented to build predictive models for performance comparison. A simple model ensemble approach was used to combine the prediction results of individual base models to further improve predictive power. A series of important factors in each domain area were identified. The findings from this study provide new evidence on factors that are associated with dementia in later life. This information will help our understanding of potential risk factors for dementia and help identify warning signs of the early stages of dementia. Longitudinal research is required to establish which factors may be causative and which factors may be a consequence of dementia.
|