1
|
Wei S, Gao Z, Ma C, Zhao Y, Guan W, Chen S. Multiple Information Prompt Learning for Cloth-Changing Person Re-Identification. IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society 2025; PP:801-815. [PMID: 40031159 DOI: 10.1109/tip.2025.3531217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2025]
Abstract
Cloth-changing person re-identification is a task closer to real-world conditions, which focuses on re-identifying pedestrians after they change clothes. The primary challenge in this field is to overcome the complex interplay between intra-class and inter-class variations and to identify features that remain unaffected by changes in appearance. Sufficient data collection for model training would significantly aid in addressing this problem. However, it is challenging to gather diverse datasets in practice. Current methods focus on implicitly learning identity information from the original image or introducing additional auxiliary models, and are therefore largely limited by the quality of the image and the performance of the additional model. To address these issues, inspired by prompt learning, we propose a novel multiple information prompt learning (MIPL) scheme for cloth-changing person ReID, which learns identity-robust features through the joint prompt guidance of multiple kinds of information. Specifically, the clothing information stripping (CIS) module is designed to decouple the clothing information from the original RGB image features to counteract the influence of clothing appearance. The bio-guided attention (BGA) module is proposed to strengthen the model's learning of key information. A dual-length hybrid patch (DHP) module is employed to give the features diverse coverage and minimize the impact of feature bias. Extensive experiments demonstrate that the proposed method outperforms all state-of-the-art methods on the LTCC, Celeb-reID, Celeb-reID-light, and CSCC datasets, achieving rank-1 scores of 74.8%, 73.3%, 66.0%, and 88.1%, respectively. When compared to AIM (CVPR23), ACID (TIP23), and SCNet (MM23), MIPL achieves rank-1 improvements of 11.3%, 13.8%, and 7.9%, respectively, on the PRCC dataset.
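The rank-1 scores quoted in the abstract above measure how often the closest gallery image to a query shares the query's identity. A minimal sketch of that metric follows, assuming a precomputed query-by-gallery distance matrix; the function name `rank1_accuracy` and the toy data are hypothetical, and the same-camera and same-clothes filtering that ReID benchmarks usually apply is deliberately left out.

```python
def rank1_accuracy(dist, query_ids, gallery_ids):
    """Fraction of queries whose nearest gallery item has the same identity.

    dist[i][j] is the distance between query i and gallery item j.
    Simplified: no camera or clothing filtering is applied.
    """
    hits = 0
    for row, qid in zip(dist, query_ids):
        # Index of the gallery item with the smallest distance to this query.
        nearest = min(range(len(row)), key=row.__getitem__)
        hits += gallery_ids[nearest] == qid
    return hits / len(query_ids)


# Toy example with made-up distances: 3 queries against 3 gallery images.
dist = [
    [0.2, 0.9, 0.7],  # nearest is gallery 0 ("a"): correct
    [0.8, 0.3, 0.6],  # nearest is gallery 1 ("b"): correct
    [0.4, 0.5, 0.9],  # nearest is gallery 0 ("a"): wrong, query is "c"
]
query_ids = ["a", "b", "c"]
gallery_ids = ["a", "b", "c"]
score = rank1_accuracy(dist, query_ids, gallery_ids)
```

In this toy case two of the three queries are matched correctly at rank 1.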
|
2
|
Yuan C, Yang L. An efficient multi-metric learning method by partitioning the metric space. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2023.01.074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
|
3
|
Yan J, Luo L, Deng C, Huang H. Adaptive Hierarchical Similarity Metric Learning With Noisy Labels. IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society 2023; 32:1245-1256. [PMID: 37022798 DOI: 10.1109/tip.2023.3242148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Deep Metric Learning (DML) plays a critical role in various machine learning tasks. However, most existing deep metric learning methods with binary similarity are sensitive to noisy labels, which are widely present in real-world data. Since these noisy labels often cause severe performance degradation, it is crucial to enhance the robustness and generalization ability of DML. In this paper, we propose an Adaptive Hierarchical Similarity Metric Learning method. It considers two kinds of noise-insensitive information, i.e., class-wise divergence and sample-wise consistency. Specifically, class-wise divergence can effectively excavate richer similarity information beyond binary similarity by taking advantage of hyperbolic metric learning, while sample-wise consistency can further improve the generalization ability of the model using contrastive augmentation. More importantly, we design an adaptive strategy to integrate this information in a unified view. It is noteworthy that the new method can be extended to any pair-based metric loss. Extensive experimental results on benchmark datasets demonstrate that our method achieves state-of-the-art performance compared with current deep metric learning approaches.
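The abstract above mentions hyperbolic metric learning without stating which model of hyperbolic space is used. As an illustration only, the sketch below assumes the Poincaré ball model, whose distance is d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))); the function name `poincare_distance` is hypothetical and this is not the authors' implementation.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points inside the unit Poincare ball.

    Assumes ||u|| < 1 and ||v|| < 1 (points strictly inside the ball).
    """
    def sq_norm(x):
        return sum(c * c for c in x)

    diff = sq_norm([a - b for a, b in zip(u, v)])
    denom = (1.0 - sq_norm(u)) * (1.0 - sq_norm(v))
    return math.acosh(1.0 + 2.0 * diff / denom)


origin = [0.0, 0.0]
d_mid = poincare_distance(origin, [0.5, 0.0])   # equals ln(3)
d_edge = poincare_distance(origin, [0.9, 0.0])  # larger: space stretches near the boundary
```

Distances grow rapidly toward the boundary of the ball, which is the property that makes hyperbolic embeddings a natural fit for hierarchical (beyond binary) similarity structure.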
|
4
|
Wang C, Wang X, Wang Z, Zhu W, Hu R. COVID-19 contact tracking by group activity trajectory recovery over camera networks. Pattern Recognition 2022; 132:108908. [PMID: 35873066 PMCID: PMC9290376 DOI: 10.1016/j.patcog.2022.108908] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2021] [Revised: 07/14/2022] [Accepted: 07/16/2022] [Indexed: 05/03/2023]
Abstract
Contact tracking plays an important role in the epidemiological investigation of COVID-19 and can effectively reduce the spread of the epidemic. As an excellent alternative method for contact tracking, mobile phone location-based methods are widely used for locating and tracking contacts. However, the inaccurate positioning algorithms currently in wide use for contact tracking lead to inaccurate follow-up of contacts. Aiming to achieve accurate contact tracking for the COVID-19 contact group, we extend the analysis of GPS data by combining it with video surveillance data and address a novel task named group activity trajectory recovery. Meanwhile, a new dataset called GATR-GPS is constructed to simulate a realistic scenario of COVID-19 contact tracking, and a coordinated optimization algorithm with a spatio-temporal constraint table is further proposed to realize efficient recovery of pedestrian trajectories. Extensive experiments on the newly collected dataset and two commonly used existing person re-identification datasets are performed, and the results clearly demonstrate that our method achieves competitive results compared to the state-of-the-art methods.
Affiliation(s)
- Chao Wang
  - National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University, Wuhan 430072, China
  - Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan University, Wuhan 430072, China
- XiaoChen Wang
  - National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University, Wuhan 430072, China
- Zhongyuan Wang
  - National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University, Wuhan 430072, China
- WenQian Zhu
  - National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University, Wuhan 430072, China
  - Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan University, Wuhan 430072, China
- Ruimin Hu
  - National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University, Wuhan 430072, China
  - Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan University, Wuhan 430072, China
  - Collaborative Innovation Center of Geospatial Technology, Wuhan 430079, China
|
5
|
Kernel Embedding Transformation Learning for Graph Matching. Pattern Recognit Lett 2022. [DOI: 10.1016/j.patrec.2022.09.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
|
6
|
Ren Q, Yuan C, Zhao Y, Yang L. A novel metric learning framework by exploiting global and local information. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.08.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
|
7
|
Bhatia H, Thiagarajan JJ, Anirudh R, TS J, Oppelstrup T, Ingólfsson HI, Lightstone F, Bremer PT. A Biology-Informed Similarity Metric for Simulated Patches of Human Cell Membrane. Machine Learning: Science and Technology 2022. [DOI: 10.1088/2632-2153/ac8523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
Abstract
Complex scientific inquiries rely increasingly upon large and autonomous multiscale simulation campaigns, which fundamentally require similarity metrics to quantify “sufficient” changes among data and/or configurations. However, subject matter experts are often unable to articulate similarity precisely or in terms of well-formulated definitions, especially when new hypotheses are to be explored, making it challenging to design a meaningful metric. Furthermore, the key to practical usefulness of such metrics to enable autonomous simulations lies in in situ inference, which requires generalization to possibly substantial distributional shifts in unseen, future data. Here, we address these challenges in a cancer biology application and develop a meaningful similarity metric for “patches”— regions of simulated human cell membrane that express interactions between certain proteins of interest and relevant lipids. In the absence of well-defined conditions for similarity, we leverage several biology-informed notions about data and the underlying simulations to impose inductive biases on our metric learning framework, resulting in a suitable similarity metric that also generalizes well to significant distributional shifts encountered during the deployment. We combine these intuitions to organize the learned embedding space in a multiscale manner, which makes the metric robust to incomplete and even contradictory intuitions. Our approach delivers a metric that not only performs well on the conditions used for its development and other relevant criteria, but also learns key spatiotemporal relationships from statistical mechanics without ever being exposed to any such information during training.
|
8
|
Swin Transformer Based on Two-Fold Loss and Background Adaptation Re-Ranking for Person Re-Identification. Electronics 2022. [DOI: 10.3390/electronics11131941] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/16/2022]
Abstract
Person re-identification (Re-ID) aims to identify the same pedestrian from a surveillance video in various scenarios. Existing Re-ID models are biased toward learning background appearances when there are many background variations in the pedestrian training set. Pedestrians with the same identity therefore appear with different backgrounds, which interferes with Re-ID performance. This paper proposes a Swin Transformer based on two-fold loss (TL-TransNet) to pay more attention to the semantic information of a pedestrian's body and preserve valuable background information, thereby reducing the interference of the corresponding background appearance. TL-TransNet is supervised by two types of losses (i.e., circle loss and instance loss) during the training phase. In the retrieval phase, DeepLabV3+ is applied as a pedestrian background segmentation model to generate body masks for the query and gallery sets. The background removal results are generated according to the mask and are used to filter out interfering background information. Subsequently, a background adaptation re-ranking is designed to combine the original information with the background-removed information, which uncovers more positive samples with large background deviation. Extensive experiments on two public person Re-ID datasets show that the proposed method achieves competitive robustness to background variation.
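The abstract above names circle loss as one of the two training losses but does not reproduce it. The sketch below follows the commonly cited general form of circle loss for a single anchor, given its within-class similarities `sp` and between-class similarities `sn`; it is an illustration under that assumption, not the TL-TransNet code, and the function name and the default values of the margin `m` and scale `gamma` are illustrative choices.

```python
import math


def circle_loss(sp, sn, m=0.25, gamma=64.0):
    """Circle loss for one anchor.

    sp: similarities to same-identity samples (should be pushed toward 1).
    sn: similarities to different-identity samples (should be pushed toward 0).
    Each similarity gets its own weight, so pairs far from their optimum
    (1 + m for positives, -m for negatives) contribute more to the loss.
    """
    pos = sum(
        math.exp(-gamma * max(0.0, 1.0 + m - s) * (s - (1.0 - m))) for s in sp
    )
    neg = sum(
        math.exp(gamma * max(0.0, s + m) * (s - m)) for s in sn
    )
    return math.log1p(pos * neg)


# Well-separated pair of similarities versus a badly ordered one.
good = circle_loss(sp=[1.0], sn=[-1.0])
bad = circle_loss(sp=[0.0], sn=[1.0])
```

The loss is close to zero when positives are near 1 and negatives are well below the margin, and it grows sharply when the ordering is reversed.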
|
9
|
Zhang C, Chen P, Lei T, Meng H. Triplet interactive attention network for cross-modality person re-identification. Pattern Recognit Lett 2021. [DOI: 10.1016/j.patrec.2021.10.010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
|
10
|
Jia Y, Wu W, Wang R, Hou J, Kwong S. Joint Optimization for Pairwise Constraint Propagation. IEEE Transactions on Neural Networks and Learning Systems 2021; 32:3168-3180. [PMID: 32745010 DOI: 10.1109/tnnls.2020.3009953] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/11/2023]
Abstract
Constrained spectral clustering (SC) based on pairwise constraint propagation (PCP) has attracted much attention due to its good performance. All existing methods can generally be cast as the following two steps: a small number of pairwise constraints are first propagated to the whole data under the guidance of a predefined affinity matrix, and the affinity matrix is then refined in accordance with the resulting propagation and finally adopted for SC. Such a stepwise manner, however, overlooks the fact that the two steps depend on each other, i.e., the two steps form a "chicken-egg" problem, leading to suboptimal performance. To this end, we propose a joint PCP model for constrained SC that simultaneously learns a propagation matrix and an affinity matrix. Specifically, it is formulated as a bounded symmetric graph regularized low-rank matrix completion problem. We also show that the affinity matrix optimized by our model exhibits an ideal appearance under some conditions. Extensive experimental results in terms of constrained SC, semisupervised classification, and propagation behavior validate the superior performance of our model compared with state-of-the-art methods.
|
11
|
|
12
|
Wang K, Wang P, Ding C, Tao D. Batch Coherence-Driven Network for Part-Aware Person Re-Identification. IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society 2021; 30:3405-3418. [PMID: 33651691 DOI: 10.1109/tip.2021.3060909] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]
Abstract
Existing part-aware person re-identification methods typically employ two separate steps: namely, body part detection and part-level feature extraction. However, part detection introduces an additional computational cost and is inherently challenging for low-quality images. Accordingly, in this work, we propose a simple framework named Batch Coherence-Driven Network (BCD-Net) that bypasses body part detection during both the training and testing phases while still learning semantically aligned part features. Our key observation is that the statistics in a batch of images are stable, and therefore that batch-level constraints are robust. First, we introduce a batch coherence-guided channel attention (BCCA) module that highlights the relevant channels for each respective part from the output of a deep backbone model. We investigate channel-part correspondence using a batch of training images, then impose a novel batch-level supervision signal that helps BCCA to identify part-relevant channels. Second, the mean position of a body part is robust and consequently coherent between batches throughout the training process. Accordingly, we introduce a pair of regularization terms based on the semantic consistency between batches. The first term regularizes the high responses of BCD-Net for each part on one batch in order to constrain it within a predefined area, while the second encourages the aggregate of BCD-Net's responses for all parts covering the entire human body. The above constraints guide BCD-Net to learn diverse, complementary, and semantically aligned part-level features. Extensive experimental results demonstrate that BCD-Net consistently achieves state-of-the-art performance on four large-scale ReID benchmarks.
|
13
|
Suárez JL, García S, Herrera F. A tutorial on distance metric learning: Mathematical foundations, algorithms, experimental analysis, prospects and challenges. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.08.017] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Indexed: 11/26/2022]
|
14
|
Nienkötter A, Jiang X. A lower bound for generalized median based consensus learning using kernel-induced distance functions. Pattern Recognit Lett 2020. [DOI: 10.1016/j.patrec.2020.11.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
|
15
|
Wang K, Ding C, Maybank SJ, Tao D. CDPM: Convolutional Deformable Part Models for Semantically Aligned Person Re-identification. IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society 2019; 29:3416-3428. [PMID: 31899424 DOI: 10.1109/tip.2019.2959923] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/10/2023]
Abstract
Part-level representations are essential for robust person re-identification. However, common errors that arise during pedestrian detection frequently result in severe misalignment problems for body parts, which degrade the quality of part representations. Accordingly, to deal with this problem, we propose a novel model named Convolutional Deformable Part Models (CDPM). CDPM works by decoupling the complex part alignment procedure into two easier steps: first, a vertical alignment step detects each body part in the vertical direction, with the help of a multi-task learning model; second, a horizontal refinement step based on attention suppresses the background information around each detected body part. Since these two steps are performed orthogonally and sequentially, the difficulty of part alignment is significantly reduced. In the testing stage, CDPM is able to accurately align flexible body parts without any need for outside information. Extensive experimental results demonstrate the effectiveness of the proposed CDPM for part alignment. Most impressively, CDPM achieves state-of-the-art performance on three large-scale datasets: Market-1501, DukeMTMC-ReID, and CUHK03.
|
16
|
Song W, Li S, Chang T, Hao A, Zhao Q, Qin H. Context-Interactive CNN for Person Re-Identification. IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society 2019; 29:2860-2874. [PMID: 31751241 DOI: 10.1109/tip.2019.2953587] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/10/2023]
Abstract
Despite growing progress in recent years, cross-scenario person re-identification remains challenging, mainly because pedestrians are commonly surrounded by highly complex environmental contexts. In reality, the human perception mechanism can adaptively find proper contextualized spatial-temporal clues for pedestrian recognition. However, conventional methods fall short in adaptively leveraging long-term spatial-temporal information due to ever-increasing computational cost. Moreover, CNN-based deep learning methods are hard to optimize due to the non-differentiable property of the built-in context search operation. To address this, this paper proposes a novel Context-Interactive CNN (CI-CNN) that dynamically finds both spatial and temporal contexts by embedding multi-task Reinforcement Learning (MTRL). The CI-CNN streamlines the multi-task reinforcement learning by using an actor-critic agent to capture the temporal-spatial context simultaneously, which comprises a context-policy network and a context-critic network. The former network learns policies to determine the optimal spatial context region and temporal sequence range. Based on the inferred temporal-spatial cues, the latter focuses on the identification task and provides feedback for the policy network. Thus, CI-CNN can simultaneously zoom the perception field in and out in the spatial and temporal domains for context interaction with the environment. By fostering the collaborative interaction between the person and context, our method achieves outstanding performance on various public benchmarks, which confirms the rationality of our hypothesis and verifies the effectiveness of our CI-CNN framework.
|