1
Jin L, Li Z, Pan Y, Tang J. Relational Consistency Induced Self-Supervised Hashing for Image Retrieval. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:1482-1494. [PMID: 37995167 DOI: 10.1109/tnnls.2023.3333294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2023]
Abstract
This article proposes a new hashing framework named relational consistency induced self-supervised hashing (RCSH) for large-scale image retrieval. To capture the potential semantic structure of data, RCSH explores the relational consistency between data samples in different spaces, which learns reliable data relationships in the latent feature space and then preserves the learned relationships in the Hamming space. The data relationships are uncovered by learning a set of prototypes that group similar data samples in the latent feature space. By uncovering the semantic structure of the data, meaningful data-to-prototype and data-to-data relationships are jointly constructed. The data-to-prototype relationships are captured by constraining the prototype assignments generated from different augmented views of an image to be the same. Meanwhile, these data-to-prototype relationships are preserved to learn informative compact hash codes by matching them with these reliable prototypes. To accomplish this, a novel dual prototype contrastive loss is proposed to maximize the agreement of prototype assignments in the latent feature space and Hamming space. The data-to-data relationships are captured by enforcing the distribution of pairwise similarities in the latent feature space and Hamming space to be consistent, which makes the learned hash codes preserve meaningful similarity relationships. Extensive experimental results on four widely used image retrieval datasets demonstrate that the proposed method significantly outperforms the state-of-the-art methods. Besides, the proposed method achieves promising performance in out-of-domain retrieval tasks, which shows its good generalization ability. The source code and models are available at https://github.com/IMAG-LuJin/RCSH.
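The data-to-data consistency idea in this abstract can be illustrated with a small sketch. The abstract does not give the loss, so everything here is an assumption made for illustration: cosine similarity, a softmax with a hypothetical `temperature` parameter, and KL divergence as the measure of disagreement between the two similarity distributions. It is not the RCSH implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similarity_distribution(vectors, anchor, temperature=0.5):
    """Softmax over similarities from vectors[anchor] to every other vector."""
    sims = [cosine(vectors[anchor], v) / temperature
            for j, v in enumerate(vectors) if j != anchor]
    top = max(sims)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def relational_consistency(features, codes, temperature=0.5):
    """Average KL between feature-space and code-space similarity distributions.

    Assumed form only: a low value means the codes rank neighbours
    the same way the latent features do.
    """
    n = len(features)
    total = 0.0
    for i in range(n):
        p = similarity_distribution(features, i, temperature)
        q = similarity_distribution(codes, i, temperature)
        total += kl_divergence(p, q)
    return total / n

# Toy data: samples 0 and 1 are close in feature space, sample 2 is far away.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
codes_consistent = [[1, 1, -1, -1], [1, 1, -1, 1], [-1, -1, 1, 1]]
codes_scrambled = [[1, 1, -1, -1], [-1, -1, 1, 1], [1, 1, -1, 1]]

loss_consistent = relational_consistency(features, codes_consistent)
loss_scrambled = relational_consistency(features, codes_scrambled)
```

On this toy data the codes that keep samples 0 and 1 close yield the lower loss. In a trainable model the codes would be relaxed, continuous network outputs rather than fixed ±1 vectors.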
2
Bai Y, Liu D, Zhang L, Wu H. A Low-Measurement-Cost-Based Multi-Strategy Hyperspectral Image Classification Scheme. SENSORS (BASEL, SWITZERLAND) 2024; 24:6647. [PMID: 39460127 PMCID: PMC11511204 DOI: 10.3390/s24206647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2024] [Revised: 10/10/2024] [Accepted: 10/14/2024] [Indexed: 10/28/2024]
Abstract
The cost of hyperspectral image (HSI) classification primarily stems from the annotation of image pixels. In real-world classification scenarios, the measurement and annotation process is both time-consuming and labor-intensive. Therefore, reducing the number of labeled pixels while maintaining classification accuracy is a key research focus in HSI classification. This paper introduces a multi-strategy triple network classifier (MSTNC) to address the issue of limited labeled data in HSI classification by improving learning strategies. First, we use the contrast learning strategy to design a lightweight triple network classifier (TNC) with low sample dependence. Due to the construction of triple sample pairs, the number of labeled samples can be increased, which is beneficial for extracting intra-class and inter-class features of pixels. Second, an active learning strategy is used to label the most valuable pixels, improving the quality of the labeled data. To address the difficulty of sampling effectively under extremely limited labeling budgets, we propose a new feature-mixed active learning (FMAL) method to query valuable samples. Fine-tuning is then used to help the MSTNC learn a more comprehensive feature distribution, reducing the model's dependence on accuracy when querying samples. Therefore, the sample quality is improved. Finally, we propose an innovative dual-threshold pseudo-active learning (DSPAL) strategy, filtering out pseudo-label samples with both high confidence and uncertainty. Extending the training set without increasing the labeling cost further improves the classification accuracy of the model. Extensive experiments are conducted on three benchmark HSI datasets. Across various labeling ratios, the MSTNC outperforms several state-of-the-art methods. In particular, under extreme small-sample conditions (five samples per class), the overall accuracy reaches 82.97% (IP), 87.94% (PU), and 86.57% (WHU).
Affiliation(s)
- Lili Zhang
  - Electronic and Information Engineering, Shenyang Aerospace University, Shenyang 110136, China
3
Fast unsupervised consistent and modality-specific hashing for multimedia retrieval. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-08008-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
4
Li Q, Tian X, Ng WW, Pelillo M. Hashing-based affinity matrix for dominant set clustering. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.06.067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
5
Semi-Supervised Cross-Modal Hashing with Multi-view Graph Representation. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.05.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
6
Zou Q, Cao L, Zhang Z, Chen L, Wang S. Transductive Zero-Shot Hashing for Multilabel Image Retrieval. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:1673-1687. [PMID: 33361006 DOI: 10.1109/tnnls.2020.3043298] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/12/2023]
Abstract
Hash coding has been widely used in approximate nearest neighbor search for large-scale image retrieval. Given semantic annotations of the training data, such as class labels and pairwise similarities, hashing methods can learn and generate effective and compact binary codes. Because some newly introduced images may carry undefined semantic labels (we call these unseen images), zero-shot hashing (ZSH) techniques have been studied for their retrieval. However, existing ZSH methods mainly focus on the retrieval of single-label images and cannot handle multilabel ones. In this article, for the first time, a novel transductive ZSH method is proposed for multilabel unseen image retrieval. To predict the labels of the unseen/target data, a visual-semantic bridge is built via instance-concept coherence ranking on the seen/source data. Then, a pairwise similarity loss and a focal quantization loss are constructed for training a hashing model using both the seen/source and unseen/target data. Extensive evaluations on three popular multilabel data sets demonstrate that the proposed hashing method achieves significantly better results than the comparison methods.
7
Monowar MM, Hamid MA, Ohi AQ, Alassafi MO, Mridha MF. AutoRet: A Self-Supervised Spatial Recurrent Network for Content-Based Image Retrieval. SENSORS 2022; 22:s22062188. [PMID: 35336358 PMCID: PMC8954462 DOI: 10.3390/s22062188] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/02/2022] [Revised: 03/02/2022] [Accepted: 03/08/2022] [Indexed: 02/05/2023]
Abstract
Image retrieval techniques are gaining popularity due to the vast availability of multimedia data. Present image retrieval systems perform excellently on labeled data. However, data labeling is often costly and sometimes impossible. Therefore, self-supervised and unsupervised learning strategies are becoming prominent. Most self-supervised and unsupervised strategies are sensitive to the number of classes and cannot incorporate labeled data when it is available. In this paper, we introduce AutoRet, a deep convolutional neural network (DCNN) based self-supervised image retrieval system. The system is trained on pairwise constraints. Therefore, it can work in self-supervision and can also be trained on a partially labeled dataset. The overall strategy includes a DCNN that extracts embeddings from multiple patches of images. The embeddings are then fused to obtain higher-quality information for the image retrieval process. The method is benchmarked on three different datasets. The overall benchmark shows that the proposed method works better in a self-supervised manner. In addition, the evaluation shows that the proposed method performs convincingly when a small portion of labeled data is mixed in, where available.
Affiliation(s)
- Muhammad Mostafa Monowar (corresponding author)
  - Department of Information Technology, Faculty of Computing & Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Md. Abdul Hamid
  - Department of Information Technology, Faculty of Computing & Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Abu Quwsar Ohi
  - Department of Computer Science & Engineering, Bangladesh University of Business & Technology, Dhaka 1216, Bangladesh
- Madini O. Alassafi
  - Department of Information Technology, Faculty of Computing & Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- M. F. Mridha
  - Department of Computer Science, American International University-Bangladesh, Dhaka 1229, Bangladesh
8
Fu X, Yang N, Ji J. Application of CT images based on the optimal atlas segmentation algorithm in the clinical diagnosis of Mycoplasma Pneumoniae Pneumonia in Children. Pak J Med Sci 2021; 37:1647-1651. [PMID: 34712299 PMCID: PMC8520366 DOI: 10.12669/pjms.37.6-wit.4860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2021] [Revised: 06/12/2021] [Accepted: 07/08/2021] [Indexed: 11/15/2022]
Abstract
Objective: To use the optimal Atlas segmentation algorithm to study the imaging signs of mycoplasma pneumonia on multi-slice spiral CT (HRCT), and to explore the value of HRCT in the diagnosis and efficacy evaluation of mycoplasma pneumonia in children. Methods: The study retrospectively analyzed 72 patients diagnosed with mycoplasma pneumonia in our hospital from January 2017 to January 2019. The imaging data and clinical data of the 72 patients were collected. The optimal Atlas segmentation algorithm was used to analyze the characteristics of the CT examinations, and the value of CT in the diagnosis of mycoplasma pneumonia and the evaluation of curative effect was summarized. Results: Among all patients, 37 cases had unilateral lesions and 35 cases had bilateral lesions; 19 cases were in the left upper lobe, 24 cases were in the left lower lobe, 21 cases were in the right upper lobe, 13 cases were in the right middle lobe, and 25 cases were in the right lower lobe. The main CT findings of the lesions before treatment were large patchy, spot-shaped shadows, and strip-shaped or ground-glass shadows. After treatment, the main CT findings were reduced lesion density and reduced lesion range. Conclusion: CT can clearly show the pulmonary lesions of mycoplasma pneumonia, and its unique imaging signs can improve the accuracy of clinical diagnosis. In addition, CT scans can evaluate the treatment effect according to changes in the characteristics of the lesion, which is of important value for the clinical diagnosis and efficacy evaluation of mycoplasma pneumonia.
Affiliation(s)
- Xilin Fu
  - Attending Physician, Department of Pediatrics, Yiwu Central Hospital, Yiwu, 322000, China
- Ningfei Yang
  - Attending Physician, Department of Pediatrics, Yiwu Central Hospital, Yiwu, 322000, China
- Jianwei Ji
  - Attending Physician, Department of Pediatrics, Yiwu Central Hospital, Yiwu, 322000, China
9
Shen X, Zhang H, Li L, Zhang Z, Chen D, Liu L. Clustering-driven Deep Adversarial Hashing for scalable unsupervised cross-modal retrieval. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.06.087] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/28/2022]
10
Yang Z, Yang L, Huang W, Sun L, Long J. Enhanced Deep Discrete Hashing with semantic-visual similarity for image retrieval. Inf Process Manag 2021. [DOI: 10.1016/j.ipm.2021.102648] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/25/2022]
11
Qin Q, Huang L, Wei Z, Nie J, Xie K, Hou J. Unsupervised Deep Quadruplet Hashing with Isometric Quantization for image retrieval. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2021.03.006] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
12
13
Liu G, Li X, Wei J. Large-area damage image restoration algorithm based on generative adversarial network. Neural Comput Appl 2021. [DOI: 10.1007/s00521-020-05308-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022]
14
Yang Z, Yang L, Raymond OI, Zhu L, Huang W, Liao Z, Long J. NSDH: A Nonlinear Supervised Discrete Hashing framework for large-scale cross-modal retrieval. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.106818] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 11/25/2022]
15
16
Wang W, Shen Y, Zhang H, Liu L. Semantic-rebased cross-modal hashing for scalable unsupervised text-visual retrieval. Inf Process Manag 2020. [DOI: 10.1016/j.ipm.2020.102374] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 10/23/2022]
17
Scalable deep asymmetric hashing via unequal-dimensional embeddings for image similarity search. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.06.036] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/22/2022]
18
Wu G, Lin Z, Ding G, Ni Q, Han J. On Aggregation of Unsupervised Deep Binary Descriptor with Weak Bits. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2020; PP:9266-9278. [PMID: 32976101 DOI: 10.1109/tip.2020.3025437] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 06/11/2023]
Abstract
Despite the success achieved by existing binary descriptors, most of them still suffer from three limitations: 1) vulnerability to geometric transformations; 2) inability to preserve the manifold structure when learning binary codes; 3) no guarantee of finding the true match if multiple candidates happen to have the same Hamming distance to a given query. Together, these make binary descriptors less effective for large-scale visual recognition tasks. In this paper, we propose a novel learning-based feature descriptor, namely Unsupervised Deep Binary Descriptor (UDBD), which learns transformation-invariant binary descriptors by projecting the original data and their transformed sets into a joint binary space. Moreover, we include an ℓ2,1-norm loss term in the binary embedding process to simultaneously gain robustness against data noise and reduce the probability of mistakenly flipping bits of the binary descriptor. On top of this, a graph constraint is used to preserve the original manifold structure in the binary space. Furthermore, a weak bit mechanism is adopted to find the real match among candidates sharing the same minimum Hamming distance, thus enhancing matching performance. Extensive experimental results on public datasets show the superiority of UDBD in terms of matching and retrieval accuracy over state-of-the-art methods.
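The tie-breaking problem named in limitation 3 can be made concrete with a small sketch. The abstract does not define how weak bits are identified or used, so the rule below is purely an assumption for illustration: a bit is treated as "weak" when the query's real-valued activation before binarization is close to zero, and among candidates tied on Hamming distance the one whose mismatches fall on the weakest query bits is preferred. The names `binarize`, `hamming`, and `best_match` are hypothetical and do not come from UDBD.

```python
def binarize(activations):
    """Threshold real-valued activations at zero to get a 0/1 code."""
    return [1 if a >= 0 else 0 for a in activations]

def hamming(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def best_match(query_activations, candidates):
    """Return the index of the best candidate code for the query.

    Primary key: Hamming distance to the binarized query.
    Tie-break (assumed rule): total |activation| of the query over the
    mismatched bits, so mismatches on low-confidence bits cost less.
    """
    query_code = binarize(query_activations)

    def score(code):
        mismatched = [i for i, (x, y) in enumerate(zip(query_code, code)) if x != y]
        weakness = sum(abs(query_activations[i]) for i in mismatched)
        return (len(mismatched), weakness)

    return min(range(len(candidates)), key=lambda i: score(candidates[i]))

# The third query bit (0.05) is barely positive, so it is the least reliable.
query = [0.9, -0.8, 0.05, 0.7]
candidates = [
    [0, 0, 1, 1],  # differs from the query code on a confident bit
    [1, 0, 0, 1],  # differs from the query code on the weak bit
]
winner = best_match(query, candidates)
```

Both candidates are at Hamming distance 1 from the query code `[1, 0, 1, 1]`, so plain Hamming ranking cannot separate them; the assumed tie-break selects the second candidate.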
19
20
Lu X, Chen Y, Li X. Siamese Dilated Inception Hashing With Intra-Group Correlation Enhancement for Image Retrieval. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:3032-3046. [PMID: 31514159 DOI: 10.1109/tnnls.2019.2935118] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 06/10/2023]
Abstract
For large-scale image retrieval, hashing has been extensively explored in approximate nearest neighbor search methods due to its low storage cost and high computational efficiency. With the development of deep learning, deep hashing methods have made great progress in image retrieval. Most existing deep hashing methods do not fully consider the intra-group correlation of hash codes, which weakens the correlation among similar hash codes and ultimately degrades the retrieval results. In this article, we propose an end-to-end siamese dilated inception hashing (SDIH) method that takes full advantage of multi-scale contextual information and category-level semantics to enhance the intra-group correlation of hash codes during hash code learning. First, a novel siamese inception dilated network architecture is presented to generate hash codes with enhanced intra-group correlation by exploiting multi-scale contextual information and category-level semantics simultaneously. Second, we propose a new regularization term that forces the continuous values to approximate discrete values in hash code learning and thereby reduces the discrepancy between the Hamming distance and the Euclidean distance. Finally, experimental results on five public data sets demonstrate that SDIH can outperform other state-of-the-art hashing algorithms.
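A short sketch can show why pushing continuous outputs toward discrete values narrows the gap between Euclidean and Hamming distance. The abstract does not state the form of the SDIH regularization term, so the penalty below, the mean of (|h| − 1)², is a commonly used stand-in chosen as an assumption, and all function names are hypothetical.

```python
def quantization_penalty(outputs):
    """Mean squared gap between |h_i| and 1 (assumed form, not the SDIH term)."""
    return sum((abs(h) - 1.0) ** 2 for h in outputs) / len(outputs)

def binarize(outputs):
    """Map real-valued outputs to a +1/-1 code by sign (zero maps to +1)."""
    return [1 if h >= 0 else -1 for h in outputs]

def hamming(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Outputs far from +1/-1 are penalized more than outputs close to +1/-1.
loose = [0.1, -0.2, 0.4, -0.3]
tight = [0.9, -0.95, 1.05, -1.0]
penalty_loose = quantization_penalty(loose)
penalty_tight = quantization_penalty(tight)

# For exact +1/-1 codes, each differing bit contributes (1 - (-1))**2 = 4,
# so squared Euclidean distance equals 4 times the Hamming distance.
code_a = binarize([0.3, -0.7, 0.0, 2.0])
code_b = binarize([-0.4, -0.1, 0.5, 1.0])
```

As the penalty approaches zero, the continuous outputs approach exact ±1 codes, for which the two distances agree up to the constant factor of 4 noted in the comment.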
21
Fakhr MW, Emara MM, Abdelhalim MB. Bagging trees with Siamese-twin neural network hashing versus unhashed features for unsupervised image retrieval. Neural Comput Appl 2020. [DOI: 10.1007/s00521-018-3684-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2022]
22
Li Z, Tang J, Zhang L, Yang J. Weakly-supervised Semantic Guided Hashing for Social Image Retrieval. Int J Comput Vis 2020. [DOI: 10.1007/s11263-020-01331-0] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Indexed: 10/24/2022]
23
24
Gu Y, Wang S, Zhang H, Yao Y, Yang W, Liu L. Clustering-driven unsupervised deep hashing for image retrieval. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2019.08.050] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Indexed: 11/16/2022]
25
26
Ye M, Li J, Ma AJ, Zheng L, Yuen PC. Dynamic Graph Co-Matching for Unsupervised Video-based Person Re-Identification. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 28:2976-2990. [PMID: 30640612 DOI: 10.1109/tip.2019.2893066] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 06/09/2023]
Abstract
Cross-camera label estimation from a set of unlabelled training data is an extremely important component in unsupervised person re-identification (re-ID) systems. With the estimated labels, existing advanced supervised learning methods can be leveraged to learn discriminative re-ID models. In this paper, we utilize the graph matching technique for accurate label estimation due to its advantages in optimal global matching and intra-camera relationship mining. However, the graph structure constructed with non-learnt similarity measurement cannot handle the large cross-camera variations, which leads to noisy and inaccurate label outputs. This paper designs a Dynamic Graph Matching (DGM) framework, which improves the label estimation process by iteratively refining the graph structure with better similarity measurement learnt from intermediate estimated labels. In addition, we design a positive re-weighting strategy to refine the intermediate labels, which enhances the robustness against inaccurate matching output and noisy initial training data. To fully utilize the abundant video information and reduce false matchings, a co-matching strategy is further incorporated into the framework. Comprehensive experiments conducted on three video benchmarks demonstrate that DGM outperforms state-of-the-art unsupervised re-ID methods and yields competitive performance to fully supervised upper bounds.
27
Long Y, Guan Y, Shao L. Generic compact representation through visual-semantic ambiguity removal. Pattern Recognit Lett 2019. [DOI: 10.1016/j.patrec.2018.04.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
28
Zhang H, Long Y, Shao L. Zero-shot Hashing with orthogonal projection for image retrieval. Pattern Recognit Lett 2019. [DOI: 10.1016/j.patrec.2018.04.011] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 11/15/2022]
29
30
Wu G, Han J, Guo Y, Liu L, Ding G, Ni Q, Shao L. Unsupervised Deep Video Hashing via Balanced Code for Large-Scale Video Retrieval. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2018; 28:1993-2007. [PMID: 30452370 DOI: 10.1109/tip.2018.2882155] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Indexed: 06/09/2023]
Abstract
This paper proposes a deep hashing framework, namely Unsupervised Deep Video Hashing (UDVH), for large-scale video similarity search, with the aim of learning compact yet effective binary codes. UDVH produces the hash codes in a self-taught manner by jointly integrating discriminative video representation with optimal code learning, where an efficient alternating approach is adopted to optimize the objective function. The key differences from most existing video hashing methods are that 1) UDVH is an unsupervised hashing method that generates hash codes by cooperatively utilizing feature clustering and a specifically designed binarization, with the original neighborhood structure preserved in the binary space; and 2) a specific rotation is developed and applied to the video features such that the variance of each dimension can be balanced, thus facilitating the subsequent quantization step. Extensive experiments performed on three popular video datasets show that UDVH outperforms state-of-the-art methods by a large margin in terms of various evaluation metrics, which makes it practical for real-world applications.