1
Zhang M, Li J, Zheng X. Semantic embedding based online cross-modal hashing method. Sci Rep 2024; 14:736. PMID: 38184671; PMCID: PMC10771426; DOI: 10.1038/s41598-023-50242-w.
Abstract
Hashing has been extensively utilized in cross-modal retrieval due to its high efficiency in handling large-scale, high-dimensional data. However, most existing cross-modal hashing methods operate as offline learning models, which learn hash codes in a batch-based manner and prove inefficient for streaming data. Recently, several online cross-modal hashing methods have been proposed to address the streaming-data scenario. Nevertheless, these methods fail to fully leverage semantic information and to optimize hash codes accurately in a discrete fashion. As a result, neither the accuracy nor the efficiency of online cross-modal hashing methods is ideal. To address these issues, this paper introduces the Semantic Embedding-based Online Cross-modal Hashing (SEOCH) method, which integrates semantic information exploitation and online learning into a unified framework. To exploit the semantic information, we map the semantic labels to a latent semantic space and construct a semantic similarity matrix to preserve the similarity between new data and existing data in the Hamming space. Moreover, we employ a discrete optimization strategy to enhance the efficiency of cross-modal retrieval for online hashing. Through extensive experiments on two publicly available multi-label datasets, we demonstrate the superiority of SEOCH.
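The label-derived similarity construction described in the abstract can be made concrete with a minimal sketch (illustrative only, not the authors' implementation; all names are assumed): under the common multi-label convention, two samples count as similar when they share at least one label.

```python
import numpy as np

def semantic_similarity(labels_new, labels_old):
    """Pairwise semantic similarity between new (streaming) and
    existing data, derived from multi-label vectors: +1 if two
    samples share at least one label, -1 otherwise."""
    overlap = labels_new @ labels_old.T      # label co-occurrence counts
    return np.where(overlap > 0, 1.0, -1.0)

# Three newly arrived samples vs. two existing ones, 4 possible labels.
L_new = np.array([[1, 0, 1, 0],
                  [0, 1, 0, 0],
                  [0, 0, 0, 1]])
L_old = np.array([[1, 0, 0, 0],
                  [0, 0, 1, 1]])
S = semantic_similarity(L_new, L_old)        # shape (3, 2)
```

Preserving such a matrix in the Hamming space then amounts to pushing inner products of hash codes toward the sign pattern of S.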
Affiliation(s)
- Meijia Zhang
- School of Data Science and Computer Science, Shandong Women's University, Jinan, 250300, China
- Shandong Provincial Key Laboratory of Network Based Intelligent Computing, Jinan, 250022, China
- Junzheng Li
- Network Information Management Center, Shandong Management University, Jinan, 250357, China
- Xiyuan Zheng
- School of Data Science and Computer Science, Shandong Women's University, Jinan, 250300, China
2
Wang Y, Chen ZD, Luo X, Li R, Xu XS. Fast Cross-Modal Hashing With Global and Local Similarity Embedding. IEEE Transactions on Cybernetics 2022; 52:10064-10077. PMID: 33750723; DOI: 10.1109/tcyb.2021.3059886.
Abstract
Recently, supervised cross-modal hashing has attracted much attention and achieved promising performance. To learn hash functions and binary codes, most methods globally exploit the supervised information, for example, by preserving an at-least-one pairwise similarity in the hash codes or reconstructing the label matrix with binary codes. However, due to the hardness of the discrete optimization problem, they are usually time-consuming on large-scale datasets. In addition, they neglect the class correlation in the supervised information. From another point of view, they explore only the global similarity of the data but overlook the local similarity hidden in the data distribution. To address these issues, we present an efficient supervised cross-modal hashing method, fast cross-modal hashing (FCMH). It leverages not only global similarity information but also the local similarity within a group. Specifically, training samples are partitioned into groups; thereafter, the local similarity in each group is extracted. Moreover, the class correlation in labels is also exploited and embedded into the learning of binary codes. In addition, to solve the discrete optimization problem, we further propose an efficient discrete optimization algorithm with a well-designed group updating scheme, making its computational complexity linear in the size of the training set. In light of this, it is more efficient and scalable to large-scale datasets. Extensive experiments on three benchmark datasets demonstrate that FCMH outperforms several state-of-the-art cross-modal hashing approaches in terms of both retrieval accuracy and learning efficiency.
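The group-wise local-similarity idea can be sketched generically (assumed names, not FCMH itself): partition samples into groups and compute similarity only within each block, avoiding a global n-by-n matrix.

```python
import numpy as np

def local_similarity_blocks(features, group_ids):
    """Partition samples into groups and compute a cosine-similarity
    block per group, instead of one global n-by-n similarity matrix."""
    blocks = {}
    for g in np.unique(group_ids):
        X = features[group_ids == g]
        X = X / np.linalg.norm(X, axis=1, keepdims=True)
        blocks[g] = X @ X.T                  # local similarity within group g
    return blocks

feats = np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]])
groups = np.array([0, 0, 1, 1])
blocks = local_similarity_blocks(feats, groups)
```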
3
Huang F, Zhang L, Gao X. Domain Adaptation Preconceived Hashing for Unconstrained Visual Retrieval. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:5641-5655. PMID: 33852407; DOI: 10.1109/tnnls.2021.3071127.
Abstract
Learning to hash has been widely applied to image retrieval due to its low storage cost and high retrieval efficiency. Existing hashing methods assume that the distributions of the retrieval pool (i.e., the datasets being retrieved) and the query data are similar, which, however, does not reflect real-world conditions because of unconstrained visual cues such as illumination, pose, and background. Owing to the large distribution gap between the retrieval pool and the query set, the performance of traditional hashing methods is seriously degraded. Therefore, we propose a new efficient and transferable hashing model for unconstrained cross-domain visual retrieval, in which the retrieval pool and the query sample are drawn from different but semantically relevant domains. Specifically, we propose a simple yet effective unsupervised hashing method, domain adaptation preconceived hashing (DAPH), for learning domain-invariant hashing representations. DAPH has three merits: 1) to the best of our knowledge, we are the first to introduce domain adaptation (DA) into hashing to learn transferable hash codes for unconstrained visual retrieval; 2) a domain-invariant feature transformation with marginal discrepancy distance minimization and a feature reconstruction constraint is learned, such that the hash code is not only domain-adaptive but also content-preserving; and 3) a DA-preconceived quantization loss is proposed, which further guarantees the discriminability of the learned hash codes for sample retrieval. Extensive experiments on various benchmark datasets verify that DAPH outperforms many state-of-the-art hashing methods on unconstrained (unrestricted) instance retrieval in both single- and cross-domain scenarios.
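The marginal-discrepancy-minimization component can be illustrated with a linear-kernel maximum mean discrepancy (MMD) between domains, a common DA building block (a generic sketch, not DAPH's actual objective):

```python
import numpy as np

def mmd_linear(source, target):
    """Squared maximum mean discrepancy with a linear kernel:
    the squared distance between the two domain means."""
    diff = source.mean(axis=0) - target.mean(axis=0)
    return float(diff @ diff)

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, size=(200, 8))    # source-domain features
tgt = rng.normal(0.5, 1.0, size=(200, 8))    # shifted target domain
gap = mmd_linear(src, tgt)                   # positive: domains differ
```

Minimizing such a term over a learned feature transformation pulls the two domain distributions together before quantization.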
4
Qin J, Fei L, Zhang Z, Wen J, Xu Y, Zhang D. Joint Specifics and Consistency Hash Learning for Large-Scale Cross-Modal Retrieval. IEEE Transactions on Image Processing 2022; 31:5343-5358. PMID: 35925845; DOI: 10.1109/tip.2022.3195059.
Abstract
With the dramatic increase in the amount of multimedia data, cross-modal similarity retrieval has become one of the most popular yet challenging problems. Hashing offers a promising solution for large-scale cross-modal data search by embedding high-dimensional data into a low-dimensional, similarity-preserving Hamming space. However, most existing cross-modal hashing methods seek a semantic representation shared by multiple modalities, which cannot fully preserve and fuse the discriminative modality-specific features and heterogeneous similarity for cross-modal search. In this paper, we propose a joint specifics and consistency hash learning method for cross-modal retrieval. Specifically, we introduce an asymmetric learning framework to fully exploit the label information for discriminative hash code learning, in which 1) each individual modality can be better converted into a meaningful subspace with specific information, 2) multiple subspaces are semantically connected to capture consistent information, and 3) the integration complexity of different subspaces is overcome so that the learned collaborative binary codes can merge the specifics with consistency. We then introduce an alternating iterative optimization to tackle the specifics-and-consistency hash learning problem, making it scalable to large-scale cross-modal retrieval. Extensive experiments on five widely used benchmark databases clearly demonstrate the effectiveness and efficiency of the proposed method on both one-cross-one and one-cross-two retrieval tasks.
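Asymmetric hash learning of the kind the abstract describes typically pairs discrete codes on one side with a real-valued variable on the other; a generic sketch of such an objective (assumed names, not this paper's solver):

```python
import numpy as np

def asymmetric_loss(B, V, S, r):
    """||B V^T - r S||_F^2: discrete codes B (entries +/-1) on one side,
    a real-valued auxiliary variable V on the other, pushing inner
    products toward +r for similar pairs and -r for dissimilar ones."""
    return float(np.sum((B @ V.T - r * S) ** 2))

r = 4                                         # code length
B = np.array([[1, 1, 1, 1],
              [-1, -1, -1, -1]])              # two 4-bit codes
S = np.array([[1.0, -1.0],
              [-1.0, 1.0]])                   # pairwise label similarity
exact = asymmetric_loss(B, B.astype(float), S, r)        # inner products hit +/-r
perturbed = asymmetric_loss(B, 0.5 * B.astype(float), S, r)
```

The asymmetry lets V be updated in closed form while B stays discrete, which is what makes alternating optimization tractable at scale.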
5
Learning ordinal constraint binary codes for fast similarity search. Inf Process Manag 2022. DOI: 10.1016/j.ipm.2022.102919.
6
7
Chen ZD, Luo X, Wang Y, Guo S, Xu XS. Fine-Grained Hashing With Double Filtering. IEEE Transactions on Image Processing 2022; 31:1671-1683. PMID: 35085079; DOI: 10.1109/tip.2022.3145159.
Abstract
Fine-grained hashing is a new topic in the field of hashing-based retrieval and has not been well explored to date. In this paper, we raise three key issues that fine-grained hashing should address simultaneously, i.e., fine-grained feature extraction, feature refinement, and a well-designed loss function. To address these issues, we propose a novel Fine-graIned haSHing method with a double-filtering mechanism and a proxy-based loss function, FISH for short. Specifically, the double-filtering mechanism consists of two modules, a Space Filtering module and a Feature Filtering module, which address the fine-grained feature extraction and feature refinement issues, respectively. The Space Filtering module is designed to highlight the critical regions in images and help the model capture more subtle and discriminative details; the Feature Filtering module is the key of FISH and aims to further refine the extracted features by supervised re-weighting and enhancement. Moreover, a proxy-based loss is adopted to train the model by preserving similarity relationships between data instances and per-class proxy vectors rather than between data instances themselves, further making FISH more efficient and effective. Experimental results demonstrate that FISH achieves much better retrieval performance than state-of-the-art fine-grained hashing methods and converges very fast. The source code is publicly available: https://github.com/chenzhenduo/FISH.
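A proxy-based loss of the kind the abstract describes compares each sample with one proxy vector per class rather than with all other samples; a minimal sketch under that reading (illustrative names, not the FISH code):

```python
import numpy as np

def proxy_loss(embeddings, proxies, labels):
    """Proxy-style loss: each sample is compared with one proxy vector
    per class (C comparisons) instead of all other samples, then
    trained with softmax cross-entropy over proxy similarities."""
    E = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    P = proxies / np.linalg.norm(proxies, axis=1, keepdims=True)
    logits = E @ P.T                              # n x C cosine similarities
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return float(-np.log(probs[np.arange(len(labels)), labels]).mean())

proxies = np.array([[1.0, 0.0], [0.0, 1.0]])          # one proxy per class
x = np.array([[0.9, 0.1], [0.1, 0.9]])                # two embeddings
loss_good = proxy_loss(x, proxies, np.array([0, 1]))  # correct classes
loss_bad = proxy_loss(x, proxies, np.array([1, 0]))   # swapped classes
```

Replacing pairwise comparisons with per-class proxies shrinks the number of terms per sample from O(n) to O(C), which is why proxy losses train fast.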
8
Boundary-Aware Hashing for Hamming Space Retrieval. Applied Sciences 2022. DOI: 10.3390/app12010508.
Abstract
Hamming space retrieval is a hot area of research in deep hashing because it is effective for large-scale image retrieval. Existing hashing algorithms have not fully used the absolute boundary to discriminate the data inside and outside the Hamming ball, and their performance is not satisfactory. In this paper, a boundary-aware contrastive loss is designed. It involves an exponential function with absolute-boundary (i.e., Hamming radius) information for dissimilar pairs and a logarithmic function that encourages small distances for similar pairs. It achieves a push that is stronger than the pull inside the Hamming ball, while the pull is stronger than the push outside the ball. Furthermore, a novel Boundary-Aware Hashing (BAH) architecture is proposed. It discriminatively penalizes dissimilar data inside and outside the Hamming ball. BAH reduces the influence of extremely imbalanced data without up-weighting similar pairs or other optimization strategies, because its exponential function rapidly converges outside the absolute boundary, creating a sharp contrast between the gradients of the logarithmic and exponential functions. Extensive experiments conducted on four benchmark datasets show that the proposed BAH obtains higher performance across different code lengths and has the advantage of handling extremely imbalanced data.
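The push/pull behavior around the Hamming radius can be sketched with the two function families the abstract names, an exponential penalty for dissimilar pairs and a logarithmic one for similar pairs (an illustrative form, not BAH's exact loss):

```python
import numpy as np

def boundary_aware_pair_loss(d, similar, radius=2.0):
    """Illustrative boundary-aware pair loss: dissimilar pairs get an
    exponential penalty that explodes as the Hamming distance d drops
    below the radius; similar pairs get a slow logarithmic penalty
    that grows with distance."""
    if similar:
        return float(np.log1p(d))         # pull similar pairs together
    return float(np.exp(radius - d))      # push dissimilar pairs out of the ball

inside = boundary_aware_pair_loss(1.0, similar=False)   # violates the boundary
outside = boundary_aware_pair_loss(6.0, similar=False)  # safely outside
```

The exponential term decays quickly past the radius, so dissimilar pairs already outside the ball contribute almost no gradient, which is the stated imbalance-handling mechanism.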
9
Yu E, Ma J, Sun J, Chang X, Zhang H, Hauptmann AG. Deep Discrete Cross-Modal Hashing with Multiple Supervision. Neurocomputing 2021. DOI: 10.1016/j.neucom.2021.11.035.
10
Yang Z, Yang L, Huang W, Sun L, Long J. Enhanced Deep Discrete Hashing with semantic-visual similarity for image retrieval. Inf Process Manag 2021. DOI: 10.1016/j.ipm.2021.102648.
11
Guo JN, Mao XL, Lin SY, Wei W, Huang H. Deep kernel supervised hashing for node classification in structural networks. Inf Sci (N Y) 2021. DOI: 10.1016/j.ins.2021.03.068.
12
Yang Z, Yang L, Raymond OI, Zhu L, Huang W, Liao Z, Long J. NSDH: A Nonlinear Supervised Discrete Hashing framework for large-scale cross-modal retrieval. Knowl Based Syst 2021. DOI: 10.1016/j.knosys.2021.106818.
13
Scalable deep asymmetric hashing via unequal-dimensional embeddings for image similarity search. Neurocomputing 2020. DOI: 10.1016/j.neucom.2020.06.036.
14
Zhu L, Lu X, Cheng Z, Li J, Zhang H. Deep Collaborative Multi-view Hashing for Large-scale Image Search. IEEE Transactions on Image Processing 2020; 29:4643-4655. PMID: 32092006; DOI: 10.1109/tip.2020.2974065.
Abstract
Hashing can significantly accelerate large-scale image search by transforming high-dimensional features into a binary Hamming space, where efficient similarity search is achieved with very fast Hamming distance computation and extremely low storage cost. As an important branch of hashing methods, multi-view hashing takes advantage of multiple features from different views for binary hash learning. However, existing multi-view hashing methods are either based on shallow models, which fail to fully capture the intrinsic correlations of heterogeneous views, or on unsupervised deep models, which suffer from insufficient semantics and cannot effectively exploit the complementarity of view features. In this paper, we propose a novel Deep Collaborative Multi-view Hashing (DCMVH) method to deeply fuse multi-view features and learn multi-view hash codes collaboratively under a deep architecture. DCMVH is a new deep multi-view hash learning framework. It mainly consists of 1) multiple view-specific networks that extract hidden representations of different views, and 2) a fusion network that learns the multi-view fused hash code. DCMVH associates different layers with instance-wise and pair-wise semantic labels, respectively. In this way, the discriminative capability of the representation layers can be progressively enhanced while the complementarity of different view features is exploited effectively. Finally, we develop a fast discrete hash optimization method based on augmented Lagrangian multipliers to efficiently solve for the binary hash codes. Experiments on public multi-view image search datasets demonstrate that our approach achieves substantial performance improvements over state-of-the-art methods.
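The fast Hamming-distance computation that motivates all of these hashing methods can be sketched directly (a generic illustration with bit-packed codes; names are assumed):

```python
import numpy as np

def hamming_distances(db_codes, query_code):
    """Hamming distance between bit-packed binary codes: XOR the words,
    then count the set bits -- the cheap operation that makes
    hashing-based search fast and storage low."""
    x = np.bitwise_xor(db_codes, query_code)
    return np.array([bin(int(v)).count("1") for v in x])

# Two 8-bit database codes and one query code, stored as integers.
db = np.array([0b10110100, 0b11110000], dtype=np.uint8)
q = np.uint8(0b10110110)
dists = hamming_distances(db, q)
```

Because each comparison is one XOR plus a popcount per machine word, scanning millions of codes costs a few milliseconds, versus floating-point distance computations over high-dimensional features.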