1. Ma S, Wan Z, Zhang L, Hu B, Zhang J, Zhao X. HFFTrack: Transformer tracking via hybrid frequency features. Neural Netw 2025; 186:107269. [PMID: 39999533 DOI: 10.1016/j.neunet.2025.107269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2024] [Revised: 12/25/2024] [Accepted: 02/10/2025] [Indexed: 02/27/2025]
Abstract
Numerous Transformer-based trackers have emerged thanks to the Transformer's powerful global modeling capability. However, the Transformer acts as a low-pass filter and has limited capacity to extract the target's high-frequency features, which are essential for target localization in tracking. To address this issue, this paper proposes a tracking algorithm based on hybrid frequency features and explores how fusing the target's multi-frequency features can improve tracker performance. First, a novel feature extraction network uses a CNN and a Transformer to learn the multi-frequency features of the target in stages, taking advantage of both structures and balancing high- and low-frequency information. Second, a dual-branch encoder lets the tracker capture global information in one branch while learning the target's local features in the other. Finally, a multi-frequency feature fusion network uses the wavelet transform and convolution to fuse high-frequency and low-frequency features. Extensive experiments demonstrate that the tracker achieves superior tracking performance on six challenging benchmark datasets (LaSOT, TrackingNet, GOT-10k, TNL2K, UAV123, and OTB100).
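The frequency-fusion idea in this abstract can be illustrated with a minimal sketch. It assumes a one-level Haar wavelet on 1-D feature vectors and a simple "low band from one source, high band from the other" rule; the abstract does not specify the wavelet, the fusion rule, or the convolutional part, so the function names `haar_dwt`, `haar_idwt`, and `fuse_bands` are hypothetical and this is not the HFFTrack network itself.

```python
import math


def haar_dwt(signal):
    """One-level orthonormal Haar transform of an even-length sequence.

    Returns (low, high): pairwise scaled sums (low-frequency band) and
    pairwise scaled differences (high-frequency band).
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return low, high


def haar_idwt(low, high):
    """Inverse of haar_dwt: rebuild the sequence from its two bands."""
    s = math.sqrt(2.0)
    out = []
    for l, h in zip(low, high):
        out.extend([(l + h) / s, (l - h) / s])
    return out


def fuse_bands(global_feat, local_feat):
    """Hypothetical fusion rule: keep the low band of `global_feat`
    (standing in for Transformer features) and the high band of
    `local_feat` (standing in for CNN features), then invert."""
    low, _ = haar_dwt(global_feat)
    _, high = haar_dwt(local_feat)
    return haar_idwt(low, high)


# A smooth vector contributes the coarse shape, an oscillating one the detail.
fused = fuse_bands([2.0, 2.0, 4.0, 4.0], [1.0, -1.0, 1.0, -1.0])
# fused is approximately [3.0, 1.0, 5.0, 3.0]
```

A learned fusion network would replace the fixed band-selection rule with trainable weights; the fixed rule is used here only to make the two frequency bands visible.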
Affiliation(s)
- Sugang Ma
  - School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an 710121, China; School of Information Engineering, Chang'an University, Xi'an 710064, China.
- Zhen Wan
  - School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an 710121, China.
- Licheng Zhang
  - School of Information Engineering, Chang'an University, Xi'an 710064, China; Shaanxi Engineering Research Center of Internet of Vehicles and Intelligent Vehicle Testing Technique, Xi'an 710064, China.
- Bin Hu
  - Department of Computer Science and Technology, Kean University, Union, NJ 07083, United States of America.
- Jinyu Zhang
  - School of Electronic Engineering, Xi'an University of Posts and Telecommunications, Xi'an 710121, China.
- Xiangmo Zhao
  - School of Information Engineering, Chang'an University, Xi'an 710064, China.
2. Hu Z, Shao J, Nie F, Luo Z, Chen C, Xiao L. Robust online learning based on siamese network for ship tracking. Sci Rep 2023; 13:7358. [PMID: 37147360 PMCID: PMC10163256 DOI: 10.1038/s41598-023-32561-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2022] [Accepted: 03/29/2023] [Indexed: 05/07/2023] [Open Access]
Abstract
In complex and changeable inland river scenes, ships are frequently occluded, and the available tracking methods are not accurate enough to estimate the motion state of the target ship, which leads to tracking drift or even loss of the target. To address this, a robust online learning ship tracking algorithm based on the Siamese network and the region proposal network is proposed. First, the algorithm combines the offline Siamese network classification score with the online classifier score for discriminative learning, and establishes an occlusion determination mechanism based on the fused classification score. When the target is occluded, the target template is not updated and a global search mechanism is activated to relocate the target, thereby avoiding tracking drift. Second, an efficient adaptive online update strategy, UpdateNet, is introduced to mitigate template degradation during tracking. Finally, in a comparison with state-of-the-art tracking algorithms on inland river ship datasets, the proposed algorithm shows strong robustness in occlusion scenarios, with an accuracy and a success rate of 56.8% and 57.2%, respectively. Supporting source code for this research is publicly available at https://github.com/Libra-jing/SiamOL.
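The occlusion handling described in this abstract can be sketched as a small decision rule. The linear fusion, the weight, the threshold value, and the names `fuse_scores` and `tracking_step` are all assumptions made for illustration; the abstract states only that a fused score drives the occlusion decision, not how the scores are combined.

```python
def fuse_scores(offline_score, online_score, weight=0.5):
    """Hypothetical linear fusion of the offline Siamese classification
    score and the online classifier score (both assumed to lie in [0, 1])."""
    return weight * offline_score + (1.0 - weight) * online_score


def tracking_step(offline_score, online_score, weight=0.5, occlusion_threshold=0.3):
    """Decide what the tracker does for one frame.

    A low fused score is treated as occlusion: the template is frozen and
    a global search is requested. Otherwise the template may be updated
    and the normal local search continues.
    """
    fused = fuse_scores(offline_score, online_score, weight)
    occluded = fused < occlusion_threshold
    return {
        "fused_score": fused,
        "occluded": occluded,
        "update_template": not occluded,
        "global_search": occluded,
    }


visible = tracking_step(0.9, 0.7)   # confident frame: template update allowed
hidden = tracking_step(0.2, 0.1)    # weak response: freeze template, search globally
```

Freezing the template under occlusion keeps occluder pixels out of the target model, which is the drift-avoidance argument the abstract makes.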
Affiliation(s)
- Zhongyi Hu
  - Intelligent Information Systems Institute, Wenzhou University, Wenzhou, 325035, China
- Jingjing Shao
  - Intelligent Information Systems Institute, Wenzhou University, Wenzhou, 325035, China
- Feiyan Nie
  - Intelligent Information Systems Institute, Wenzhou University, Wenzhou, 325035, China
- Zhenzhen Luo
  - Intelligent Information Systems Institute, Wenzhou University, Wenzhou, 325035, China
- Changzu Chen
  - School of Intelligent Manufacturing and Electronic Engineering, Wenzhou University of Technology, Wenzhou, 325088, China
- Lei Xiao
  - Intelligent Information Systems Institute, Wenzhou University, Wenzhou, 325035, China
3. Contrastive Learning with Dynamic Weighting and Jigsaw Augmentation for Brain Tumor Classification in MRI. Neural Process Lett 2023. [DOI: 10.1007/s11063-022-11108-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2023]
4. Hassan E, Shams MY, Hikal NA, Elmougy S. The effect of choosing optimizer algorithms to improve computer vision tasks: a comparative study. Multimedia Tools and Applications 2022; 82:16591-16633. [PMID: 36185324 PMCID: PMC9514986 DOI: 10.1007/s11042-022-13820-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/30/2022] [Revised: 06/30/2022] [Accepted: 09/06/2022] [Indexed: 06/16/2023]
Abstract
Optimization algorithms are used to improve model accuracy. The optimization process runs through multiple cycles until convergence. A variety of optimization strategies have been developed to overcome the obstacles involved in the learning process, and some of them are examined in this study to better understand their complexities. It is crucial to analyse and summarise optimization techniques methodically from a machine learning standpoint, since this can provide direction for future work in both machine learning and optimization. The approaches under consideration include Stochastic Gradient Descent (SGD), Stochastic Optimization Descent with Momentum, Runge-Kutta, Adaptive Learning Rate, Root Mean Square Propagation, Adaptive Moment Estimation (Adam), Deep Ensembles, Feedback Alignment, Direct Feedback Alignment, Adafactor, AMSGrad, and Gravity. Experiments test the ability of each optimizer when applied to machine learning models. First, skin cancer detection on the standard ISIC dataset was tested with three common optimizers (Adam, SGD, and Root Mean Square Propagation) to explore the effect of the algorithms on skin images. The best training results were obtained with the Adam optimizer, which achieved 97.30% accuracy. The second dataset consists of COVIDx CT images, on which the Adam optimizer achieved 99.07% accuracy. The results indicate that optimizers such as SGD and Adam improved accuracy in the training, testing, and validation stages.
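Two of the optimizers named in this abstract can be compared on a toy problem. The sketch below applies the standard SGD-with-momentum and Adam update rules to a one-dimensional quadratic; the objective, the hyperparameter values, and the function names are assumptions chosen for illustration and do not reproduce the study's experiments.

```python
import math


def sgd_momentum(grad, x0, lr=0.1, momentum=0.9, steps=500):
    """SGD with classical momentum: the velocity accumulates past gradients."""
    x, velocity = x0, 0.0
    for _ in range(steps):
        velocity = momentum * velocity - lr * grad(x)
        x += velocity
    return x


def adam(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Adam: bias-corrected running averages of the gradient (m) and of
    its square (v) set a per-step, per-parameter step size."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)   # bias correction for the mean
        v_hat = v / (1.0 - beta2 ** t)   # bias correction for the variance
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


def quadratic_grad(x):
    """Gradient of the toy objective f(x) = (x - 3)^2, minimised at x = 3."""
    return 2.0 * (x - 3.0)


x_sgd = sgd_momentum(quadratic_grad, x0=0.0)
x_adam = adam(quadratic_grad, x0=0.0)
# Both iterates end close to the minimiser x = 3.
```

On a convex toy objective both methods reach the minimiser, so the sketch shows only the mechanics of the updates and says nothing about the accuracy differences reported on image datasets.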
Affiliation(s)
- Esraa Hassan
  - Faculty of Artificial Intelligence, Kafrelsheikh University, Kafrelsheikh 33516, Egypt
- Mahmoud Y. Shams
  - Faculty of Artificial Intelligence, Kafrelsheikh University, Kafrelsheikh 33516, Egypt
- Noha A. Hikal
  - Department of Information Technology, Faculty of Computers and Information, Mansoura University, Mansoura 35516, Egypt
- Samir Elmougy
  - Department of Computer Science, Faculty of Computers and Information, Mansoura University, Mansoura 35516, Egypt
5. Double Branch Attention Block for Discriminative Representation of Siamese Trackers. Applied Sciences-Basel 2022. [DOI: 10.3390/app12062897] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
Abstract
Siamese trackers have achieved a good balance between accuracy and efficiency in generic object tracking. However, background distractors degrade the discriminative representation of the target. To reduce the sensitivity of trackers to background distractors, we propose a Double Branch Attention (DBA) block and a Siamese tracker equipped with the DBA block, named DBA-Siam. First, the DBA block concatenates channels of multiple layers from the two branches of the Siamese framework to obtain a rich feature representation. Second, channel attention is applied to the two concatenated feature blocks to selectively enhance the robust features, improving the ability to distinguish the target from a complex background. Finally, the DBA block collects the contextual relevance between the Siamese branches and adaptively encodes it into the feature weights of the detection branch for information compensation. Ablation experiments show that the proposed block enhances the discriminative representation of the target and significantly improves tracking performance. Results on two popular benchmarks show that DBA-Siam performs favorably against its counterparts. Compared with the advanced algorithm CSTNet, DBA-Siam improves the EAO by 18.9% on VOT2016.
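The first two steps in this abstract (concatenating channels from several layers of one branch, then reweighting them with channel attention) can be sketched as follows. This is a generic squeeze-and-gate form of channel attention with hypothetical per-channel parameters `weights` and `biases`; the actual DBA block, its learned layers, and the cross-branch compensation step are not described in enough detail here to reproduce.

```python
import math


def concat_layers(layer_features):
    """Concatenate the channels of several layers into one feature block.

    Each layer is a list of channels; each channel is a flat list of
    spatial activations (all layers assumed to share the spatial size).
    """
    block = []
    for channels in layer_features:
        block.extend(channels)
    return block


def channel_attention(block, weights, biases):
    """Rescale every channel by a sigmoid gate computed from its mean.

    `weights` and `biases` stand in for learned parameters (assumption).
    """
    gated = []
    for channel, w, b in zip(block, weights, biases):
        squeezed = sum(channel) / len(channel)               # global average pooling
        gate = 1.0 / (1.0 + math.exp(-(w * squeezed + b)))   # value in (0, 1)
        gated.append([gate * value for value in channel])
    return gated


# Two layers of one branch, with one and two channels of four activations each.
shallow = [[1.0, 1.0, 1.0, 1.0]]
deep = [[0.0, 2.0, 0.0, 2.0], [4.0, 4.0, 4.0, 4.0]]
block = concat_layers([shallow, deep])
attended = channel_attention(block, weights=[0.0, 0.0, 0.0], biases=[0.0, 0.0, 0.0])
# With zero weights and biases every gate is 0.5, so each channel is halved.
```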
6. Li L, Jin W, Huang Y. Few-shot contrastive learning for image classification and its application to insulator identification. Appl Intell 2021; 52:6148-6163. [PMID: 34764617 PMCID: PMC8412402 DOI: 10.1007/s10489-021-02769-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Accepted: 08/14/2021] [Indexed: 11/06/2022]
Abstract
This paper presents a novel discriminative few-shot learning architecture based on a batch compact loss. Convolutional neural networks (CNNs) currently achieve reasonably good performance in image recognition. Most existing CNN methods enable classifiers to learn discriminative patterns for identifying existing categories that were trained with large numbers of samples. However, learning to recognize novel categories from only a few examples is a challenging task. To address this, we propose the Residual Compact Network, which trains a deep neural network to learn hierarchical nonlinear transformations that project image pairs into the same latent feature space, in which the distance between each positive pair is reduced. To make better use of the commonality of class-level features for category recognition, we develop a batch compact loss that forms robust feature representations relevant to a category. The proposed methods are evaluated on several datasets. Experimental evaluations show that the proposed method achieves acceptable results in few-shot learning.
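The abstract does not define the batch compact loss, so the sketch below shows only a generic within-class compactness measure in the same spirit: the mean squared distance of each embedding in a batch to the centroid of its class. The function name `class_compactness_loss` and the exact formula are assumptions, not the paper's definition.

```python
from collections import defaultdict


def class_compactness_loss(embeddings, labels):
    """Mean squared Euclidean distance of each embedding to its class centroid.

    Lower values mean that same-class samples sit closer together in the
    latent space, which is the property a compactness loss rewards.
    """
    if not embeddings:
        raise ValueError("batch must not be empty")
    groups = defaultdict(list)
    for vector, label in zip(embeddings, labels):
        groups[label].append(vector)

    total, count = 0.0, 0
    for vectors in groups.values():
        dim = len(vectors[0])
        centroid = [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
        for v in vectors:
            total += sum((v[d] - centroid[d]) ** 2 for d in range(dim))
            count += 1
    return total / count


batch = [[0.0, 0.0], [2.0, 0.0], [5.0, 5.0], [5.0, 5.0]]
labels = ["a", "a", "b", "b"]
loss = class_compactness_loss(batch, labels)
# Class "a" has spread around its centroid [1, 0]; class "b" is already compact.
```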
Affiliation(s)
- Liang Li
  - Southwest Jiaotong University, Chengdu City, Sichuan Province, China
- Weidong Jin
  - Southwest Jiaotong University, Chengdu City, Sichuan Province, China
  - China-ASEAN International Joint Laboratory of Integrated Transportation, Nanning University, Nanning City, Guangxi Province, China
- Yingkun Huang
  - Southwest Jiaotong University, Chengdu City, Sichuan Province, China
7. Yao S, Han X, Zhang H, Wang X, Cao X. Learning Deep Lucas-Kanade Siamese Network for Visual Tracking. IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society 2021; 30:4814-4827. [PMID: 33945475 DOI: 10.1109/tip.2021.3076272] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 05/23/2023]
Abstract
In recent years, Siamese trackers have drawn great attention because of their well-balanced accuracy and efficiency. Although these approaches have achieved great success, the discriminative power of conventional Siamese trackers is still limited by insufficient template-candidate representation. Most existing approaches use non-aligned features to learn a similarity function for template-candidate matching, while the geometric transformation of the target object is seldom explored. To address this problem, we propose a novel Siamese tracking framework that dynamically transforms the template-candidate features to a more discriminative viewpoint for similarity matching. Specifically, we reformulate the template-candidate matching problem of the conventional Siamese tracker from the perspective of the Lucas-Kanade (LK) image alignment approach. A Lucas-Kanade network (LKNet) is proposed and incorporated into the Siamese architecture to learn aligned feature representations in a data-driven, trainable manner, which enhances the model's adaptability in challenging scenarios. Within this framework, we propose two Siamese trackers, named LK-Siam and LK-SiamRPN, to validate its effectiveness. Extensive experiments on prevalent datasets show that the proposed method is competitive with a number of state-of-the-art methods.
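For readers unfamiliar with the Lucas-Kanade alignment that this abstract builds on, the sketch below shows the classical (non-learned) algorithm in its simplest setting: estimating a single 1-D translation by Gauss-Newton iterations. The function name, the Gaussian test signal, and the numerical derivative are assumptions for illustration; LKNet itself operates on learned features and is not reproduced here.

```python
import math


def lucas_kanade_shift(image, template, xs, iterations=20, h=1e-4):
    """Estimate p such that image(x + p) matches template(x) on the samples xs.

    Each iteration linearises image(x + p) around the current p and solves
    the resulting least-squares problem for the update (Gauss-Newton step).
    """
    p = 0.0
    for _ in range(iterations):
        numerator = denominator = 0.0
        for x in xs:
            warped = image(x + p)
            # Central-difference estimate of the image gradient at x + p.
            gradient = (image(x + p + h) - image(x + p - h)) / (2.0 * h)
            numerator += gradient * (template(x) - warped)
            denominator += gradient * gradient
        if denominator == 0.0:
            break  # flat signal: the shift cannot be observed
        p += numerator / denominator
    return p


def bump(x):
    """Smooth 1-D test signal: a Gaussian centred at x = 5."""
    return math.exp(-((x - 5.0) ** 2) / (2.0 * 1.5 ** 2))


samples = [i * 0.25 for i in range(41)]   # sample positions 0.0 .. 10.0
estimate = lucas_kanade_shift(bump, lambda x: bump(x + 0.8), samples)
# estimate converges to the true shift of 0.8
```

The same least-squares structure extends to 2-D warps with more parameters, where the scalar gradient becomes a Jacobian and the division becomes a small linear solve.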