51
|
Crafting adversarial example with adaptive root mean square gradient on deep neural networks. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.01.084]
|
52
|
Bai X, Zhu L, Liang C, Li J, Nie X, Chang X. Multi-view feature selection via Nonnegative Structured Graph Learning. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.01.044]
|
53
|
Yao T, Han Y, Wang R, Kong X, Yan L, Fu H, Tian Q. Efficient discrete supervised hashing for large-scale cross-modal retrieval. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.12.086]
|
54
|
Semantic image segmentation with shared decomposition convolution and boundary reinforcement structure. Applied Intelligence 2020. [DOI: 10.1007/s10489-020-01671-x]
|
55
|
Chang D, Ding Y, Xie J, Bhunia AK, Li X, Ma Z, Wu M, Guo J, Song YZ. The Devil is in the Channels: Mutual-Channel Loss for Fine-Grained Image Classification. IEEE Transactions on Image Processing 2020; 29:4683-4695. [PMID: 32092002] [DOI: 10.1109/tip.2020.2973812]
Abstract
The key to solving fine-grained image categorization is finding discriminative local regions that correspond to subtle visual traits. Great strides have been made, with complex networks designed specifically to learn part-level discriminative feature representations. In this paper, we show that it is possible to cultivate subtle details without the need for overly complicated network designs or training mechanisms - a single loss is all it takes. The main trick lies in how we delve into individual feature channels early on, as opposed to the convention of starting from a consolidated feature map. The proposed loss function, termed the mutual-channel loss (MC-Loss), consists of two channel-specific components: a discriminality component and a diversity component. The discriminality component forces all feature channels belonging to the same class to be discriminative, through a novel channel-wise attention mechanism. The diversity component additionally constrains channels so that they become mutually exclusive across the spatial dimension. The end result is a set of feature channels, each of which reflects a different locally discriminative region for a specific class. The MC-Loss can be trained end-to-end, without the need for any bounding-box/part annotations, and yields highly discriminative regions during inference. Experimental results show that our MC-Loss, when implemented on top of common base networks, can achieve state-of-the-art performance on all four fine-grained categorization datasets (CUB-Birds, FGVC-Aircraft, Flowers-102, and Stanford Cars). Ablative studies further demonstrate the superiority of the MC-Loss when compared with other recently proposed general-purpose losses for visual classification, on two different base networks.
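To make the two components concrete, the following is a minimal PyTorch sketch of a mutual-channel-style loss, assuming xi feature channels per class and omitting the paper's random channel-wise attention step; the grouping, pooling choices, and the lambda_div weighting are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn.functional as F

def mutual_channel_loss(feat, labels, num_classes, xi=3, lambda_div=10.0):
    """Simplified mutual-channel-style loss (sketch only).

    feat:   (B, num_classes * xi, H, W) feature maps, xi channels per class
    labels: (B,) integer class labels
    Note: the paper's random channel-wise attention (channel dropping) during
    training is omitted here for brevity.
    """
    B, C, H, W = feat.shape
    assert C == num_classes * xi

    # ----- discriminality component -----
    # group channels by class, fuse each group with cross-channel max pooling,
    # then global average pooling gives one logit per class
    grouped = feat.view(B, num_classes, xi, H, W)
    fused = grouped.max(dim=2).values              # (B, num_classes, H, W)
    logits = fused.mean(dim=(2, 3))                # (B, num_classes)
    l_dis = F.cross_entropy(logits, labels)

    # ----- diversity component -----
    # softmax over spatial locations turns each channel into a spatial
    # distribution; taking the max across the xi channels of a class and
    # summing over locations is large only when channels attend to
    # different places, so this value (in [1, xi]) is maximised
    spatial = F.softmax(grouped.view(B, num_classes, xi, H * W), dim=-1)
    diversity = spatial.max(dim=2).values.sum(dim=-1).mean()

    return l_dis - lambda_div * diversity


# toy usage (dimensions assumed): 200 classes with xi = 3 channels each
feat = torch.randn(4, 200 * 3, 7, 7)
labels = torch.randint(0, 200, (4,))
loss = mutual_channel_loss(feat, labels, num_classes=200)
```

In the paper's setup, a term of this kind would typically be combined with the ordinary classification loss of the base network.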
|
56
|
Zheng X, Chen X, Lu X. A Joint Relationship Aware Neural Network for Single-Image 3D Human Pose Estimation. IEEE Transactions on Image Processing 2020; 29:4747-4758. [PMID: 32070954] [DOI: 10.1109/tip.2020.2972104]
Abstract
This paper studies the task of 3D human pose estimation from a single RGB image, which is challenging without depth information. Recently, many deep learning methods have been proposed and have achieved great improvements thanks to their strong representation learning. However, most existing methods ignore the relationship between joint features. In this paper, a joint relationship aware neural network is proposed that takes both global and local joint relationships into consideration. First, a whole feature block representing all human body joints is extracted by a convolutional neural network. A Dual Attention Module (DAM) is applied to the whole feature block to generate attention weights. By exploiting this attention module, the global relationship among all joints is encoded. Second, the weighted whole feature block is divided into individual joint features. To capture salient joint features, the individual joint features are refined by individual DAMs. Finally, a joint angle prediction constraint is proposed to account for local joint relationships. Quantitative and qualitative experiments on 3D human pose estimation benchmarks demonstrate the effectiveness of the proposed method.
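The abstract names a Dual Attention Module but does not specify its internals; as a hedged illustration, the sketch below assumes a CBAM-style pairing of channel and spatial attention and an arbitrary split of 17 joints with 32 channels each, applied first globally and then per joint, mirroring the "global then individual" refinement described above.

```python
import torch
import torch.nn as nn

class DualAttention(nn.Module):
    """Illustrative dual (channel + spatial) attention block.

    The DAM's exact architecture is not given in the abstract, so this is
    only a plausible stand-in: channel attention via squeeze-and-excitation,
    spatial attention via a convolution over pooled channel statistics.
    """
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.channel_gate(x)                       # re-weight channels
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.max(dim=1, keepdim=True).values], dim=1)
        return x * self.spatial_gate(pooled)               # re-weight locations


# Global DAM over the whole feature block, then one DAM per joint group
# (17 joints x 32 channels each is an assumed split).
feat = torch.randn(2, 17 * 32, 16, 16)
global_dam = DualAttention(17 * 32)
joint_dams = nn.ModuleList(DualAttention(32) for _ in range(17))

weighted = global_dam(feat)
joints = weighted.view(2, 17, 32, 16, 16)
refined = torch.stack([joint_dams[j](joints[:, j]) for j in range(17)], dim=1)
```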
|
57
|
Yao T, Yan L, Ma Y, Yu H, Su Q, Wang G, Tian Q. Fast discrete cross-modal hashing with semantic consistency. Neural Networks 2020; 125:142-152. [PMID: 32088568] [DOI: 10.1016/j.neunet.2020.01.035]
Abstract
Supervised cross-modal hashing has attracted widespread attention for large-scale retrieval tasks due to its promising retrieval performance. However, most existing works suffer from some of the following issues. First, most of them only leverage the pair-wise similarity matrix to learn hash codes, which may result in class information loss. Second, the pair-wise similarity matrix generally leads to high computational complexity and memory cost. Third, most of them relax the discrete constraints during optimization, which generally results in a large cumulative quantization error and consequently inferior hash codes. To address the above problems, we present a Fast Discrete Cross-modal Hashing method in this paper, FDCH for short. Specifically, it first leverages both class labels and the pair-wise similarity matrix to learn a shared Hamming space where semantic consistency can be better preserved. Then we propose an asymmetric hash code learning model to avoid the challenging issue of symmetric matrix factorization. Finally, an effective and efficient discrete optimization scheme is designed to generate discrete hash codes directly, and the computational complexity and memory cost caused by the pair-wise similarity matrix are reduced from O(n²) to O(n), where n denotes the size of the training set. Extensive experiments conducted on three real-world datasets highlight the superiority of FDCH compared with several cross-modal hashing methods and demonstrate its effectiveness and efficiency.
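The O(n²) to O(n) reduction hinges on never materialising the pair-wise similarity matrix. The toy NumPy sketch below illustrates one common factorisation trick of this kind, assuming a similarity defined from one-hot labels as S = 2*L*L^T - 1; the final sign step merely stands in for FDCH's actual discrete optimisation, which the abstract does not spell out.

```python
import numpy as np

rng = np.random.default_rng(0)
n, c, r = 2000, 10, 32              # samples, classes, hash bits (toy sizes)

L = np.eye(c)[rng.integers(0, c, n)]          # n x c one-hot label matrix
V = rng.standard_normal((n, r))               # real-valued codes of one modality

# Pairwise semantic similarity: S = 2 L L^T - 1 (+1 same class, -1 otherwise).
# Materialising S costs O(n^2); the product S @ V can instead be factorised:
#   S @ V = 2 L (L^T V) - 1 (1^T V)
SV_fast = 2.0 * L @ (L.T @ V) - np.ones((n, 1)) @ V.sum(axis=0, keepdims=True)

# Check against the explicit O(n^2) computation on this toy problem.
S = 2.0 * L @ L.T - 1.0
assert np.allclose(S @ V, SV_fast)

# A discrete code could then be taken bit-wise / via a sign step, e.g.
B = np.sign(SV_fast)
B[B == 0] = 1
```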
Affiliation(s)
- Tao Yao: Department of Information and Electrical Engineering, Ludong University, Yantai, 264000, China; Yantai Research Institute of New Generation Information Technology, Southwest Jiaotong University, 264000, China.
- Lianshan Yan: Yantai Research Institute of New Generation Information Technology, Southwest Jiaotong University, 264000, China.
- Yilan Ma: Department of Information and Electrical Engineering, Ludong University, Yantai, 264000, China.
- Hong Yu: Department of Information and Electrical Engineering, Ludong University, Yantai, 264000, China.
- Qingtang Su: Department of Information and Electrical Engineering, Ludong University, Yantai, 264000, China.
- Gang Wang: Department of Information and Electrical Engineering, Ludong University, Yantai, 264000, China.
- Qi Tian: Huawei Noah's Ark Lab, 518129, China.
|
58
|
Xie D, Deng C, Li C, Liu X, Tao D. Multi-Task Consistency-Preserving Adversarial Hashing for Cross-Modal Retrieval. IEEE Transactions on Image Processing 2020; 29:3626-3637. [PMID: 31940536] [DOI: 10.1109/tip.2020.2963957]
Abstract
Owing to the advantages of low storage cost and high query efficiency, cross-modal hashing has received increasing attention recently. Because they fail to bridge the inherent gap between modalities, most existing cross-modal hashing methods have limited capability to explore the semantic consistency between different modality data, leading to unsatisfactory search performance. To address this problem, we propose a novel deep hashing method named Multi-Task Consistency-Preserving Adversarial Hashing (CPAH) to fully explore the semantic consistency and correlation between different modalities for efficient cross-modal retrieval. First, we design a consistency refined module (CR) to divide the representations of different modalities into two irrelevant parts, i.e., modality-common and modality-private representations. Then, a multi-task adversarial learning module (MA) is presented, which makes the modality-common representations of different modalities close to each other in terms of feature distribution and semantic consistency. Finally, compact and powerful hash codes can be generated from the modality-common representation. Comprehensive evaluations conducted on three representative cross-modal benchmark datasets illustrate that our method is superior to the state-of-the-art cross-modal hashing methods.
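As a rough illustration of the ideas in the abstract, the sketch below pairs a gated split into modality-common and modality-private parts with a modality discriminator trained adversarially against the encoders; the gating design, dimensions, and loss terms are assumptions rather than CPAH's actual modules.

```python
import torch
import torch.nn as nn

class ConsistencyRefine(nn.Module):
    """Sketch of a gated split into modality-common / modality-private parts.
    The gating design is an assumption; the paper's CR module may differ."""
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, h):
        g = self.gate(h)
        return g * h, (1.0 - g) * h            # common, private


class ModalityDiscriminator(nn.Module):
    """Predicts which modality a common representation came from; the encoders
    are trained adversarially so common features become indistinguishable."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim // 2), nn.ReLU(),
                                 nn.Linear(dim // 2, 2))

    def forward(self, z):
        return self.net(z)


dim = 128
cr_img, cr_txt = ConsistencyRefine(dim), ConsistencyRefine(dim)
disc = ModalityDiscriminator(dim)
hash_head = nn.Sequential(nn.Linear(dim, 64), nn.Tanh())   # sign at test time

img_feat, txt_feat = torch.randn(8, dim), torch.randn(8, dim)
img_common, _ = cr_img(img_feat)
txt_common, _ = cr_txt(txt_feat)

# Adversarial game: the discriminator learns to tell modalities apart ...
d_logits = disc(torch.cat([img_common, txt_common]).detach())
d_labels = torch.cat([torch.zeros(8, dtype=torch.long), torch.ones(8, dtype=torch.long)])
d_loss = nn.functional.cross_entropy(d_logits, d_labels)
# ... while the encoders are updated to fool it (e.g. via gradient reversal)
# and to keep the cross-modal hash codes close:
code_loss = (hash_head(img_common) - hash_head(txt_common)).pow(2).mean()
```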
|
60
|
Gu Y, Wang S, Zhang H, Yao Y, Yang W, Liu L. Clustering-driven unsupervised deep hashing for image retrieval. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2019.08.050]
|
61
|
Zhang Z, Lai Z, Huang Z, Wong WK, Xie GS, Liu L, Shao L. Scalable Supervised Asymmetric Hashing With Semantic and Latent Factor Embedding. IEEE Transactions on Image Processing 2019; 28:4803-4818. [PMID: 31071030] [DOI: 10.1109/tip.2019.2912290]
Abstract
Compact hash code learning has been widely applied to fast similarity search owing to its significantly reduced storage and highly efficient query speed. However, it is still a challenging task to learn discriminative binary codes that perfectly preserve the full pairwise similarities embedded in the high-dimensional real-valued features, such that promising performance can be guaranteed. To overcome this difficulty, in this paper, we propose a novel scalable supervised asymmetric hashing (SSAH) method, which skillfully approximates the full pairwise similarity matrix based on the maximum asymmetric inner product of two different non-binary embeddings. In particular, to comprehensively explore the semantic information of the data, the supervised label information and the refined latent feature embedding are simultaneously considered to construct a high-quality hashing function and boost the discriminability of the learned binary codes. Specifically, SSAH learns two distinctive hashing functions by jointly minimizing the regression loss on the semantic label alignment and the encoding loss on the refined latent features. More importantly, instead of using only part of the similarity correlations of the data, the full pairwise similarity matrix is directly utilized to avoid information loss and performance degeneration, and its cumbersome computational complexity on the n×n matrix can be dexterously handled during the optimization phase. Furthermore, an efficient alternating optimization scheme with guaranteed convergence is designed to address the resulting discrete optimization problem. Encouraging experimental results on diverse benchmark datasets demonstrate the superiority of the proposed SSAH method in comparison with many recently proposed hashing algorithms.
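To illustrate the asymmetric approximation idea on a toy scale, the NumPy sketch below fits r·S with the inner product of two different non-binary embeddings (one from features, one from labels) by alternating least squares and takes the sign at the end. Unlike SSAH's actual scheme, it materialises S explicitly, which is only viable here because the example is tiny; a factorisation of the kind shown for entry 57 above is what avoids the n×n cost in practice.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, c, r = 1000, 64, 10, 16        # samples, feature dim, classes, bits (toy)

X = rng.standard_normal((n, d))                  # real-valued features
Y = np.eye(c)[rng.integers(0, c, n)]             # one-hot semantic labels
S = 2.0 * Y @ Y.T - 1.0                          # full pairwise similarity (toy scale only)

# Asymmetric idea: approximate r*S by the inner product of two different
# non-binary embeddings U = X P and V = Y Q, instead of B B^T with B binary.
P = rng.standard_normal((d, r)) * 0.1
Q = rng.standard_normal((c, r)) * 0.1

for _ in range(20):                              # crude alternating least squares
    V = Y @ Q
    # update P with V fixed:  min_P || r*S - (X P) V^T ||^2
    P = np.linalg.lstsq(X, r * S @ V @ np.linalg.pinv(V.T @ V), rcond=None)[0]
    U = X @ P
    # update Q with U fixed:  min_Q || r*S - U (Y Q)^T ||^2
    Q = np.linalg.lstsq(Y, r * S.T @ U @ np.linalg.pinv(U.T @ U), rcond=None)[0]

# Binary codes are finally taken as the sign of the learned embedding.
B = np.sign(X @ P)
B[B == 0] = 1
```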
|
62
|
Enhanced Feature Representation in Detection for Optical Remote Sensing Images. Remote Sensing 2019. [DOI: 10.3390/rs11182095]
Abstract
In recent years, deep learning has led to a remarkable breakthrough in object detection in remote sensing images. In practice, two-stage detectors perform well in terms of detection accuracy but are slow. On the other hand, one-stage detectors integrate the detection pipeline of two-stage detectors to simplify the detection process and are faster, but with lower detection accuracy. Enhancing the capability of feature representation may be a way to improve the detection accuracy of one-stage detectors. Toward this goal, this paper proposes a novel one-stage detector with an enhanced capability of feature representation. The enhanced capability benefits from two proposed structures: a dual top-down module and a dense-connected inception module. The former efficiently utilizes multi-scale features from multiple layers of the backbone network. The latter both widens and deepens the network to enhance the ability of feature representation with limited extra computational cost. To evaluate the effectiveness of the proposed structures, we conducted experiments on horizontal bounding box detection tasks on the challenging DOTA dataset and obtained 73.49% mean Average Precision (mAP), achieving state-of-the-art performance. Furthermore, our method ran significantly faster than the best public two-stage detector on the DOTA dataset.
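The abstract only names the two structures, so the PyTorch sketch below is an assumed reading of a dense-connected inception block: parallel multi-scale branches whose outputs are densely concatenated across stacked units. Channel counts and branch choices are illustrative, and the dual top-down module is not shown.

```python
import torch
import torch.nn as nn

class InceptionUnit(nn.Module):
    """Parallel 1x1 / 3x3 / 5x5 branches whose outputs are concatenated."""
    def __init__(self, in_ch, branch_ch):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, branch_ch, 1)
        self.b3 = nn.Conv2d(in_ch, branch_ch, 3, padding=1)
        self.b5 = nn.Conv2d(in_ch, branch_ch, 5, padding=2)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(torch.cat([self.b1(x), self.b3(x), self.b5(x)], dim=1))


class DenseInception(nn.Module):
    """Stacks inception units with dense connections: every unit sees the
    concatenation of the block input and all previous unit outputs, which
    widens and deepens the block at modest extra cost."""
    def __init__(self, in_ch, branch_ch=32, num_units=3):
        super().__init__()
        self.units = nn.ModuleList()
        ch = in_ch
        for _ in range(num_units):
            self.units.append(InceptionUnit(ch, branch_ch))
            ch += 3 * branch_ch                 # each unit adds 3*branch_ch channels
        self.out_channels = ch

    def forward(self, x):
        feats = [x]
        for unit in self.units:
            feats.append(unit(torch.cat(feats, dim=1)))
        return torch.cat(feats, dim=1)


block = DenseInception(in_ch=256)
print(block(torch.randn(1, 256, 32, 32)).shape)   # torch.Size([1, 544, 32, 32])
```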
|