1. Xing P, Tang H, Tang J, Li Z. ADPS: Asymmetric Distillation Postsegmentation for Image Anomaly Detection. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:7051-7064. [PMID: 38683707] [DOI: 10.1109/tnnls.2024.3390806]
Abstract
Knowledge distillation-based anomaly detection (KDAD) methods rely on the teacher-student paradigm to detect and segment anomalous regions by contrasting the unique features extracted by the two networks. However, existing KDAD methods suffer from two main limitations: 1) the student network can effortlessly replicate the teacher network's representations, and 2) the features of the teacher network serve solely as a "reference standard" and are not fully leveraged. To address this, we depart from the established paradigm and propose an approach called asymmetric distillation postsegmentation (ADPS). ADPS employs an asymmetric distillation paradigm that feeds distinct forms of the same image to the teacher and student networks, driving the student network to learn discriminative representations for anomalous regions. Meanwhile, a customized weight mask block (WMB) generates a coarse anomaly localization mask that transfers the distilled knowledge acquired from the asymmetric paradigm to the teacher network. Equipped with the WMB, the proposed postsegmentation module (PSM) can effectively detect and segment abnormal regions with fine structures and clear boundaries. Experimental results demonstrate that ADPS outperforms state-of-the-art methods in detecting and segmenting anomalies. Notably, ADPS improves the average precision (AP) metric by 9% and 20% on the MVTec anomaly detection (AD) and KolektorSDD2 datasets, respectively.
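The generic KDAD scoring step this abstract builds on, contrasting teacher and student feature maps and flagging locations where they disagree, can be sketched as below. This is an illustrative sketch of distillation-based scoring in general, not the authors' ADPS: the function name and the cosine-distance choice are assumptions.

```python
import numpy as np

def kd_anomaly_map(teacher_feats: np.ndarray, student_feats: np.ndarray) -> np.ndarray:
    """Per-location anomaly score as the cosine distance between teacher
    and student feature maps of shape (C, H, W). Regions the student
    fails to replicate receive high scores."""
    eps = 1e-8
    t = teacher_feats / (np.linalg.norm(teacher_feats, axis=0, keepdims=True) + eps)
    s = student_feats / (np.linalg.norm(student_feats, axis=0, keepdims=True) + eps)
    return 1.0 - (t * s).sum(axis=0)  # shape (H, W); higher = more anomalous

# Where the student matches the teacher the score is ~0; where the
# features point in opposite directions it approaches 2.
rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 4, 4))
assert np.allclose(kd_anomaly_map(feats, feats), 0.0, atol=1e-6)
assert np.allclose(kd_anomaly_map(feats, -feats), 2.0, atol=1e-6)
```

In this framing, ADPS's asymmetric paradigm amounts to deliberately giving the two networks different views of the image so the score above stays informative instead of collapsing to zero everywhere.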
2. Wang W, Chang F, Liu C, Wang B, Liu Z. TODO-Net: Temporally Observed Domain Contrastive Network for 3-D Early Action Prediction. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:6122-6133. [PMID: 38743544] [DOI: 10.1109/tnnls.2024.3394254]
Abstract
Early action prediction, which aims to recognize an action's class before the action is fully conveyed, is a very challenging task owing to the insufficient discriminative information caused by domain gaps among different temporally observed domains. Most existing approaches use fully observed temporal domains to "guide" the partially observed domains while ignoring the discrepancies between the harder low-observed temporal domains and the easier highly observed ones. Recognition models tend to learn the easier samples from the highly observed temporal domains, which may lead to significant performance drops on low-observed temporal domains. Therefore, in this article, we propose a novel temporally observed domain contrastive network, TODO-Net, that explicitly mines discriminative information from hard action samples in the low-observed temporal domains by mitigating the domain gaps among the various temporally observed domains for 3-D early action prediction. More specifically, TODO-Net mines the relationship between low-observed sequences and all highly observed sequences of the same action category to boost recognition of hard samples with fewer observed frames. We also introduce a temporal domain conditioned supervised contrastive (TD-conditioned SupCon) learning scheme that empowers TODO-Net to minimize the gaps between temporal domains within the same action category while pushing apart temporal domains belonging to different action classes. We conduct extensive experiments on two public 3-D skeleton-based activity datasets, and the results show the efficacy of the proposed TODO-Net.
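The TD-conditioned SupCon scheme follows the general supervised contrastive recipe: pull same-class embeddings together and push different-class embeddings apart. A minimal unconditioned supervised contrastive loss can be sketched as follows; the function name and temperature value are assumptions, and the paper's temporal-domain conditioning is omitted.

```python
import numpy as np

def supcon_loss(feats: np.ndarray, labels: np.ndarray, tau: float = 0.1) -> float:
    """Supervised contrastive loss: for each anchor, maximize the softmax
    probability of same-label embeddings relative to all other embeddings."""
    z = feats / np.linalg.norm(feats, axis=1, keepdims=True)   # unit-normalize
    logits = z @ z.T / tau                                     # pairwise similarities
    n = len(labels)
    self_mask = np.eye(n, dtype=bool)
    logits = np.where(self_mask, -np.inf, logits)              # exclude self-pairs
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    pos = (labels[:, None] == labels[None, :]) & ~self_mask    # same-label pairs
    return float(-(np.where(pos, log_prob, 0.0).sum(axis=1) / pos.sum(axis=1)).mean())

# Embeddings clustered by label incur a much lower loss than mislabeled ones.
feats = np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]])
good = supcon_loss(feats, np.array([0, 0, 1, 1]))
bad = supcon_loss(feats, np.array([0, 1, 0, 1]))
assert good < bad
```

The paper's variant additionally conditions the positive/negative pairing on which temporally observed domain each sequence comes from.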
3. Dong B, Chen D, Wu Y, Tang S, Zhuang Y. FADngs: Federated Learning for Anomaly Detection. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:2578-2592. [PMID: 38241100] [DOI: 10.1109/tnnls.2024.3350660]
Abstract
With the increasing demand for data privacy, federated learning (FL) has gained popularity for various applications. Most existing FL works focus on the classification task, overlooking scenarios where anomaly detection may also require privacy preservation. Traditional anomaly detection algorithms cannot be directly applied to the FL setting due to false and missed detection issues. Moreover, with the aggregation methods commonly used in FL (e.g., averaging model parameters), the global model cannot retain the local models' ability to discriminate anomalies deviating from local distributions, which further degrades performance. To address these challenges, we propose Federated Anomaly Detection with Noisy Global Density Estimation and Self-supervised Ensemble Distillation (FADngs). Specifically, FADngs aligns each client's knowledge of the data distributions by sharing processed density functions. Besides, FADngs trains local models with an improved contrastive learning scheme that, based on the shared density functions, learns representations more discriminative for anomaly detection. Furthermore, FADngs aggregates capacities by ensemble distillation, which distills the knowledge learned from different distributions into the global model. Our experiments demonstrate that the proposed method significantly outperforms state-of-the-art federated anomaly detection methods. We also empirically show that the shared density function is privacy-preserving. The code is provided for research purposes at https://github.com/kanade00/Federated_Anomaly_detection.
4. Bai J, Ren J, Xiao Z, Chen Z, Gao C, Ali TAA, Jiao L. Localizing From Classification: Self-Directed Weakly Supervised Object Localization for Remote Sensing Images. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:17935-17949. [PMID: 37672374] [DOI: 10.1109/tnnls.2023.3309889]
Abstract
In recent years, object localization and detection methods for remote sensing images (RSIs) have received increasing attention due to their broad applications. However, most previous fully supervised methods require a large number of time-consuming and labor-intensive instance-level annotations. In contrast, weakly supervised object localization (WSOL) aims to recognize object instances using only image-level labels, which greatly reduces the labeling cost for RSIs. In this article, we propose a self-directed weakly supervised strategy (SD-WSS) to perform WSOL in RSIs. Specifically, we fully exploit and enhance the spatial feature extraction capability of an RSI classification model to accurately localize the objects of interest. To alleviate the discriminative-region problem exhibited by previous WSOL methods, the spatial location information implicit in the classification model is extracted by GradCAM++ to guide the learning procedure. Furthermore, to eliminate interference from the complex backgrounds of RSIs, we design a novel self-directed loss that makes the model optimize itself and explicitly tells it where to look. Finally, we review and annotate an existing remote sensing scene classification dataset and create two new WSOL benchmarks for RSIs, named C45V2 and PN2. We conduct extensive experiments to evaluate the proposed method and six mainstream WSOL methods with three backbones on C45V2 and PN2. The results demonstrate that our method outperforms the state-of-the-art.
5. Ye F, Bors AG. Lifelong Generative Adversarial Autoencoder. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:14684-14698. [PMID: 37410645] [DOI: 10.1109/tnnls.2023.3281091]
Abstract
Lifelong learning describes the ability to continually acquire and learn new information without forgetting. This capability, common to humans and animals, has lately been identified as an essential function for artificial intelligence systems aiming to learn continuously from a stream of data. However, modern neural networks suffer degraded performance when learning multiple domains sequentially and fail to recognize previously learned tasks after retraining. This is catastrophic forgetting, ultimately induced by overwriting the parameters associated with previously learned tasks. One approach in lifelong learning is the generative replay mechanism (GRM), which trains a powerful generator as the generative replay network, implemented as a variational autoencoder (VAE) or a generative adversarial network (GAN). In this article, we study the forgetting behavior of GRM-based learning systems by developing a new theoretical framework in which the forgetting process is expressed as an increase in the model's risk during training. Although many recent attempts have produced high-quality generative replay samples using GANs, their use in downstream tasks is limited by the lack of an inference mechanism. Inspired by this theoretical analysis, and aiming to address the drawbacks of existing approaches, we propose the lifelong generative adversarial autoencoder (LGAA). LGAA consists of a generative replay network and three inference models, each addressing the inference of a different type of latent variable. The experimental results show that LGAA learns novel visual concepts without forgetting and can be applied to a wide range of downstream tasks.
6. Zaheer MZ, Mahmood A, Astrid M, Lee SI. Clustering Aided Weakly Supervised Training to Detect Anomalous Events in Surveillance Videos. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:14085-14098. [PMID: 37235464] [DOI: 10.1109/tnnls.2023.3274611]
Abstract
Formulating learning systems for the detection of real-world anomalous events using only video-level labels is a challenging task mainly due to the presence of noisy labels as well as the rare occurrence of anomalous events in the training data. We propose a weakly supervised anomaly detection system that has multiple contributions including a random batch selection mechanism to reduce interbatch correlation and a normalcy suppression block (NSB) which learns to minimize anomaly scores over normal regions of a video by utilizing the overall information available in a training batch. In addition, a clustering loss block (CLB) is proposed to mitigate the label noise and to improve the representation learning for the anomalous and normal regions. This block encourages the backbone network to produce two distinct feature clusters representing normal and anomalous events. An extensive analysis of the proposed approach is provided using three popular anomaly detection datasets including UCF-Crime, ShanghaiTech, and UCSD Ped2. The experiments demonstrate the superior anomaly detection capability of our approach.
7. Huyan N, Zhang X, Quan D, Chanussot J, Jiao L. AUD-Net: A Unified Deep Detector for Multiple Hyperspectral Image Anomaly Detection via Relation and Few-Shot Learning. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:6835-6849. [PMID: 36301787] [DOI: 10.1109/tnnls.2022.3213023]
Abstract
This article addresses the problem of building an out-of-the-box deep detector, motivated by the need to perform anomaly detection across multiple hyperspectral images (HSIs) without repeated training. To solve this challenging task, we propose a unified anomaly detection network (AUD-Net) inspired by few-shot learning. The crucial issues solved by AUD-Net are: how to improve the generalization of the model to various HSIs containing different categories of land cover, and how to unify the different spectral sizes across HSIs. To achieve this, we first build a series of subtasks that classify the relations between the center of a dual window and its surroundings. Through relation learning, AUD-Net generalizes more easily to unseen HSIs, as the relations between pixel pairs are shared among different HSIs. Second, to handle HSIs with various spectral sizes, we propose a pooling layer based on the vector of locally aggregated descriptors, which maps variable-sized features to the same space and produces fixed-sized relation embeddings. To determine whether the center of the dual window is an anomaly, we build a transformer-based memory model that integrates the contextual relation embeddings in the dual window and estimates the relation embedding of the center. By computing the feature difference between the estimated relation embeddings of the centers and the corresponding real ones, centers with large differences are detected as anomalies, as they are harder to estimate from their surroundings. Extensive experiments on both a simulation dataset and 13 real HSIs demonstrate that AUD-Net generalizes strongly across HSIs and achieves significant advantages over detectors trained specifically for each HSI.
8. Asif H, Min S, Wang X, Vaidya J. U.S.-U.K. PETs Prize Challenge: Anomaly Detection via Privacy-Enhanced Federated Learning. IEEE Transactions on Privacy 2024; 1:3-18. [PMID: 38979543] [PMCID: PMC11229673] [DOI: 10.1109/tp.2024.3392721]
Abstract
Privacy Enhancing Technologies (PETs) have the potential to enable collaborative analytics without compromising privacy. This is extremely important because collaborative analytics can allow us to extract value from the large amounts of data collected in domains such as healthcare, finance, and national security, among others. To foster innovation and move PETs from research labs to actual deployment, the U.S. and U.K. governments partnered in 2021 to propose the PETs Prize Challenge, asking for privacy-enhancing solutions to two of the biggest problems facing us today: financial crime prevention and pandemic response. This article presents the Rutgers ScarletPets privacy-preserving federated learning approach to identifying anomalous financial transactions in a payment network system (PNS). The approach uses a two-step anomaly detection methodology. In the first step, features are mined from account-level data and labels, and a privacy-preserving encoding scheme is used to augment the data held by the PNS with these features. In the second step, the PNS learns a highly accurate classifier from the augmented data. Our approach has two major advantages: 1) there is no noteworthy drop in accuracy between the federated and the centralized settings, and 2) it is flexible, since the PNS can keep improving its model and features to build a better classifier without imposing any additional computational or privacy burden on the banks. Notably, our solution won the first prize in the U.S. for its privacy, utility, efficiency, and flexibility.
Affiliation(s)
- Hafiz Asif
- Information Systems and Business Analytics Department, Hofstra University, Hempstead, NY 11549 USA
- Sitao Min
- Management Science and Information Systems Department, Rutgers University, Newark, NJ 08901 USA
- Xinyue Wang
- Management Science and Information Systems Department, Rutgers University, Newark, NJ 08901 USA
- Jaideep Vaidya
- Management Science and Information Systems Department, Rutgers University, Newark, NJ 08901 USA
9. Kocon M, Malesa M, Rapcewicz J. Ultra-Lightweight Fast Anomaly Detectors for Industrial Applications. Sensors (Basel) 2023; 24:161. [PMID: 38203022] [PMCID: PMC10781384] [DOI: 10.3390/s24010161]
Abstract
Quality inspection in the pharmaceutical and food industries is crucial to ensure that products are safe for customers. Among the properties controlled in the production process are chemical composition, the content of active substances, and visual appearance. Although the latter may not influence the product's properties, it lowers customers' confidence in drugs or food and affects brand perception. The visual appearance of consumer goods is typically inspected during the packaging process using machine vision quality inspection systems. In line with current trends, image processing is often supported with deep neural networks, which increases the accuracy of fault detection and classification. Solutions based on AI are best suited to production lines with a limited number of formats or highly repeatable production. Where formats differ significantly from each other and are changed often, a quality inspection system has to enable fast training. In this paper, we present a fast image anomaly detection method for high-speed production lines. The proposed method meets these requirements: it is easy and fast to train, even on devices with limited computing power; the inference time for each production sample is sufficient for real-time scenarios; and the ultra-lightweight algorithm can be easily adapted to different products and market segments. We present the results of our algorithm on three real production datasets gathered from the food and pharmaceutical industries.
Affiliation(s)
- Jerzy Rapcewicz
- Institute of Automatic Control and Robotics, Warsaw University of Technology, 02-525 Warsaw, Poland
10. Wei D, Zheng J, Qu H. Anomaly detection for blueberry data using sparse autoencoder-support vector machine. PeerJ Comput Sci 2023; 9:e1214. [PMID: 37346526] [PMCID: PMC10280483] [DOI: 10.7717/peerj-cs.1214]
Abstract
High-dimensional space includes many subspaces, so anomalies can hide in any of them, which makes abnormality detection difficult. Most existing anomaly detection methods measure distances between data points. Unfortunately, distances between data points become more similar as the dimensionality of the input data increases, making it difficult to differentiate data points. As such, the high dimensionality of input data poses an obvious challenge for anomaly detection. To address this issue, this article proposes a hybrid method combining a sparse autoencoder with a support vector machine. The principle is that the proposed sparse autoencoder first captures the low-dimensional features of the input dataset, reducing its dimensionality. Then, the support vector machine separates abnormal from normal features in the captured low-dimensional feature space. To improve the precision of separation, a novel kernel is derived based on the Mercer theorem. Meanwhile, to prevent normal points from being mistakenly classified, the upper limit on the number of abnormal points is estimated by the Chebyshev theorem. Experiments on both synthetic datasets and the UCI datasets show that the proposed method outperforms state-of-the-art detection methods in anomaly detection ability. We find that the newly designed kernel can explore different subregions, which better separates anomalous instances from normal ones. Moreover, our results suggest that anomaly detection models suffer fewer negative effects from the complexity of the data distribution in the space reconstructed from layered features than in the original space.
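The two-stage pipeline this abstract describes, learning a low-dimensional encoding of the input and then separating anomalies with an SVM in that space, can be sketched with off-the-shelf components. Here PCA stands in for the learned sparse autoencoder and a standard RBF kernel for the paper's Mercer-derived kernel, so this is a structural sketch under those substitutions, not the authors' method.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(42)
X_normal = rng.standard_normal((500, 50))        # high-dimensional normal data
X_anomaly = 6.0 * rng.standard_normal((20, 50))  # much larger spread -> anomalous

model = make_pipeline(
    PCA(n_components=5),                  # stage 1: dimensionality reduction
    OneClassSVM(kernel="rbf", nu=0.05),   # stage 2: one-class separation
)
model.fit(X_normal)                       # fit on normal data only

# predict() returns +1 for inliers and -1 for outliers.
assert (model.predict(X_anomaly) == -1).mean() > 0.7
assert (model.predict(X_normal) == 1).mean() > 0.85
```

The point of the reduction stage is exactly the abstract's argument: distance-based separation that is hopeless in 50 dimensions becomes easy in the 5-dimensional encoded space.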
Affiliation(s)
- Dianwen Wei
- Institute of Natural Resources and Ecology, Heilongjiang Academy of Sciences, Harbin, China
- Jian Zheng
- College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, China
- Hongchun Qu
- College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, China
- College of Automation, Chongqing University of Posts and Telecommunications, Chongqing, China
11. Jiang Z, Xu X, Zhang L, Zhang C, Foo CS, Zhu C. MA-GANet: A Multi-Attention Generative Adversarial Network for Defocus Blur Detection. IEEE Transactions on Image Processing 2022; 31:3494-3508. [PMID: 35533163] [DOI: 10.1109/tip.2022.3171424]
Abstract
Background clutter poses challenges to defocus blur detection. Existing approaches often produce artifact predictions in cluttered background areas and low-confidence predictions in boundary areas. In this work, we tackle these issues from two perspectives. First, inspired by the recent success of the self-attention mechanism, we introduce channel-wise and spatial-wise attention modules that attentively aggregate features at different channels and spatial locations to obtain more discriminative features. Second, we propose a generative adversarial training strategy to suppress spurious and unreliable predictions. This is achieved by using a discriminator to distinguish predicted defocus maps from ground-truth ones; the defocus network (generator) must therefore produce "realistic" defocus maps to minimize the discriminator loss. We further demonstrate that generative adversarial training allows exploiting additional unlabeled data to improve performance, i.e., semi-supervised learning, and we provide the first benchmark for semi-supervised defocus detection. Finally, we show that existing evaluation metrics for defocus detection generally fail to quantify robustness with respect to thresholding. For a fair and practical evaluation, we introduce an effective yet efficient AUFβ metric. Extensive experiments on three public datasets verify the superiority of the proposed methods over state-of-the-art approaches.
12. Mujkic E, Philipsen MP, Moeslund TB, Christiansen MP, Ravn O. Anomaly Detection for Agricultural Vehicles Using Autoencoders. Sensors (Basel) 2022; 22:3608. [PMID: 35632017] [PMCID: PMC9145690] [DOI: 10.3390/s22103608]
Abstract
The safe in-field operation of autonomous agricultural vehicles requires detecting all objects that pose a risk of collision. Current vision-based algorithms for object detection and classification are unable to detect unknown classes of objects. In this paper, the problem is posed as anomaly detection instead, where convolutional autoencoders are applied to identify any objects deviating from the normal pattern. Training an autoencoder network to reconstruct normal patterns in agricultural fields makes it possible to detect unknown objects by high reconstruction error. Basic autoencoder (AE), vector-quantized variational autoencoder (VQ-VAE), denoising autoencoder (DAE) and semisupervised autoencoder (SSAE) with a max-margin-inspired loss function are investigated and compared with a baseline object detector based on YOLOv5. Results indicate that SSAE with an area under the curve for precision/recall (PR AUC) of 0.9353 outperforms other autoencoder models and is comparable to an object detector with a PR AUC of 0.9794. Qualitative results show that SSAE is capable of detecting unknown objects, whereas the object detector is unable to do so and fails to identify known classes of objects in specific cases.
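The reconstruction-error principle shared by all four autoencoder variants above can be sketched with a closed-form linear autoencoder (equivalent to PCA via SVD): train on normal data only, then flag samples whose reconstruction error exceeds a threshold. The paper uses convolutional models on camera images; the linear model, the 99th-percentile threshold rule, and the variable names here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Normal samples lie near a 3-D subspace of a 20-D space (plus small noise).
basis = rng.standard_normal((3, 20))
X_train = rng.standard_normal((200, 3)) @ basis + 0.05 * rng.standard_normal((200, 20))
X_anom = 2.0 * rng.standard_normal((10, 20))      # off-subspace anomalies

# Linear autoencoder fitted in closed form: the top-3 right singular
# vectors of the centered training data act as encoder/decoder weights.
mean = X_train.mean(axis=0)
_, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
W = Vt[:3]                                        # (3, 20) encoder weights

def recon_error(X: np.ndarray) -> np.ndarray:
    Z = (X - mean) @ W.T                          # encode to 3-D
    X_hat = Z @ W + mean                          # decode back to 20-D
    return ((X - X_hat) ** 2).mean(axis=1)        # per-sample reconstruction MSE

# Threshold at the 99th percentile of the training (normal) errors:
# samples off the learned subspace reconstruct poorly and exceed it.
threshold = np.quantile(recon_error(X_train), 0.99)
assert (recon_error(X_anom) > threshold).mean() > 0.8
```

This is why the approach handles unknown object classes: nothing about the anomaly needs to be seen at training time, only the normal pattern.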
Affiliation(s)
- Esma Mujkic
- Automation and Control Group, Department of Electrical Engineering, Technical University of Denmark, 2800 Kongens Lyngby, Denmark
- AGCO A/S, 8930 Randers, Denmark
- Mark P. Philipsen
- Visual Analysis and Perception Lab, Department of Architecture, Design, and Media Technology, Aalborg University, 9000 Aalborg, Denmark
- Thomas B. Moeslund
- Visual Analysis and Perception Lab, Department of Architecture, Design, and Media Technology, Aalborg University, 9000 Aalborg, Denmark
- Ole Ravn
- Automation and Control Group, Department of Electrical Engineering, Technical University of Denmark, 2800 Kongens Lyngby, Denmark