1. Bortolussi L, Carbone G, Laurenti L, Patane A, Sanguinetti G, Wicker M. On the Robustness of Bayesian Neural Networks to Adversarial Attacks. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:6679-6692. [PMID: 38648123] [DOI: 10.1109/tnnls.2024.3386642]
Abstract
Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications. Despite significant efforts, both practical and theoretical, training deep learning models robust to adversarial attacks is still an open problem. In this article, we analyse the geometry of adversarial attacks in the over-parameterized limit for Bayesian neural networks (BNNs). We show that, in the limit, vulnerability to gradient-based attacks arises as a result of degeneracy in the data distribution, i.e., when the data lie on a lower-dimensional submanifold of the ambient space. As a direct consequence, we demonstrate that in this limit, BNN posteriors are robust to gradient-based adversarial attacks. Crucially, by relying on the convergence of infinitely wide BNNs to Gaussian processes (GPs), we prove that, under certain relatively mild assumptions, the gradient of the loss, taken in expectation over the BNN posterior distribution, vanishes, even when each NN sampled from the BNN posterior does not have vanishing gradients. Experimental results on MNIST, Fashion MNIST, and a synthetic dataset, with BNNs trained by Hamiltonian Monte Carlo and variational inference, support this line of argument, empirically showing that BNNs can display high accuracy on clean data together with robustness to both gradient-based and gradient-free adversarial attacks.
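Restated schematically, the key theoretical claim is that the attack gradient vanishes only in expectation over the posterior. The rendering below uses generic notation and is an editorial paraphrase, not the authors' exact formulation:

```latex
% Schematic statement in generic notation (editorial paraphrase): in the
% infinite-width limit, where the BNN posterior converges to a Gaussian
% process, the input gradient of the loss vanishes in expectation over
% the posterior,
\[
  \mathbb{E}_{w \sim p(w \mid \mathcal{D})}\big[ \nabla_{x}\, L(x, y; w) \big]
  \;\longrightarrow\; 0 ,
\]
% even though an individual posterior sample w may satisfy
% \nabla_x L(x, y; w) \neq 0, so single sampled networks remain attackable
% while the posterior-averaged predictor is not.
```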
2. Yu C, Chen J, Wang Y, Xue Y, Ma H. Improving Adversarial Robustness Against Universal Patch Attacks Through Feature Norm Suppressing. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:1410-1424. [PMID: 37917525] [DOI: 10.1109/tnnls.2023.3326871]
Abstract
Universal adversarial patch attacks, which are readily implemented, have been validated to be able to fool real-world deep convolutional neural networks (CNNs), posing a serious threat to practical computer vision systems based on CNNs. Unfortunately, current defenses remain severely understudied and face the following problems. Patch detection-based methods suffer from dramatic performance drops against white-box or adaptive attacks, since they rely heavily on empirical clues. Methods based on adversarial training or certified defense are difficult to scale up to large-scale datasets or complex practical networks due to prohibitively high computational overhead or overly strong assumptions on the network structure. In this article, we focus on two widely adopted cases of universal adversarial patch attacks, namely the universal targeted attack on image classifiers and the universal vanishing attack on object detectors. We find that, for popular CNNs, the success of the adversarial patch relies on feature vectors centered at the patch location having a large norm in classifiers and a large channel-aware norm (CA-Norm) in detectors, and we further present a mathematical explanation for this phenomenon. Based on this, we propose a simple but effective defense method using a feature norm suppressing (FNS) layer, which renormalizes the feature norm through nonincreasing functions. As a differentiable module, FNS can be adaptively inserted into various CNN architectures to achieve multistage suppression of large-norm feature vectors. Moreover, FNS is efficient, with no trainable parameters and very low computational overhead. We evaluate the proposed defense across multiple CNN architectures and datasets against strong adaptive white-box attacks in both visual classification and detection tasks. In both tasks, FNS significantly outperforms previous defenses in adversarial robustness, with relatively little impact on performance on benign images. Code is available at https://github.com/jschenthu/FNS.
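As an illustration of the general idea of suppressing large feature norms with a nonincreasing rescaling function, a minimal PyTorch sketch is given below. The class name `NormSuppress` and the threshold `tau` are hypothetical; the authors' actual FNS layer is available at the repository linked above.

```python
import torch
import torch.nn as nn

class NormSuppress(nn.Module):
    """Hypothetical sketch of a feature-norm-suppressing layer.

    Each spatial feature vector whose channel-wise L2 norm exceeds `tau`
    is rescaled by the nonincreasing factor tau / norm; small-norm
    features pass through unchanged. No trainable parameters.
    """

    def __init__(self, tau: float = 10.0, eps: float = 1e-6):
        super().__init__()
        self.tau = tau
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width)
        norm = x.norm(p=2, dim=1, keepdim=True)                      # per-location channel norm
        scale = torch.clamp(self.tau / (norm + self.eps), max=1.0)   # nonincreasing in the norm
        return x * scale

# Example: insert the layer between two convolutional stages.
backbone = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    NormSuppress(tau=10.0),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
)
out = backbone(torch.randn(2, 3, 32, 32))
```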
3. Liu H, Ge Z, Zhou Z, Shang F, Liu Y, Jiao L. Gradient Correction for White-Box Adversarial Attacks. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:18419-18430. [PMID: 37819820] [DOI: 10.1109/tnnls.2023.3315414]
Abstract
Deep neural networks (DNNs) play key roles in various artificial intelligence applications such as image classification and object recognition. However, a growing number of studies have shown that adversarial examples exist for DNNs: inputs that are almost imperceptibly different from the original samples but can greatly change the output of the network. Recently, many white-box attack algorithms have been proposed, and most of them concentrate on how to make the best use of gradients per iteration to improve adversarial performance. In this article, we focus on the properties of the widely used activation function, the rectified linear unit (ReLU), and find that there exist two phenomena (i.e., wrong blocking and over transmission) that misguide the calculation of ReLU gradients during backpropagation. Both issues enlarge the gap between the change in the loss predicted from the gradients and the corresponding actual change, and misguide the optimization direction, which results in larger perturbations. Therefore, we propose a universal gradient-correction adversarial example generation method, called ADV-ReLU, to enhance the performance of gradient-based white-box attack algorithms such as the fast gradient sign method (FGSM), iterative FGSM (I-FGSM), momentum I-FGSM (MI-FGSM), and variance tuning MI-FGSM (VMI-FGSM). Through backpropagation, our approach calculates the gradient of the loss function with respect to the network input, maps the values to scores, and selects a part of them to update the misguided gradients. Comprehensive experimental results on ImageNet and CIFAR10 demonstrate that ADV-ReLU can be easily integrated into many state-of-the-art gradient-based white-box attack algorithms, as well as transferred to black-box attacks, to further decrease perturbations measured in the -norm.
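For reference, the baseline I-FGSM attack that ADV-ReLU augments can be sketched as follows. This is a generic PyTorch sketch of standard I-FGSM under an assumed [0, 1] pixel range, not the paper's gradient-correction procedure, which the abstract only summarizes.

```python
import torch

def ifgsm(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Minimal sketch of standard I-FGSM (not ADV-ReLU itself).

    Repeatedly steps along the sign of the input gradient of the loss,
    projecting back into the L-infinity ball of radius eps around x and
    into the assumed valid pixel range [0, 1].
    """
    loss_fn = torch.nn.CrossEntropyLoss()
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()               # ascend the loss
            x_adv = x + (x_adv - x).clamp(-eps, eps)          # project to the eps-ball
            x_adv = x_adv.clamp(0.0, 1.0)                     # keep a valid image
    return x_adv.detach()
```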
4. Rossolini G, Nesti F, D'Amico G, Nair S, Biondi A, Buttazzo G. On the Real-World Adversarial Robustness of Real-Time Semantic Segmentation Models for Autonomous Driving. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:18328-18342. [PMID: 37782588] [DOI: 10.1109/tnnls.2023.3314512]
Abstract
The existence of real-world adversarial examples (RWAEs), commonly in the form of patches, poses a serious threat to the use of deep learning models in safety-critical computer vision tasks such as visual perception in autonomous driving. This article presents an extensive evaluation of the robustness of semantic segmentation (SS) models when attacked with different types of adversarial patches, including digital, simulated, and physical ones. A novel loss function is proposed to improve the attacker's ability to induce pixel misclassifications. Also, a novel attack strategy is presented to improve the expectation over transformation (EOT) method for placing a patch in the scene. Finally, a state-of-the-art method for detecting adversarial patches is first extended to cope with SS models, then improved to obtain real-time performance, and eventually evaluated in real-world scenarios. Experimental results reveal that even though the adversarial effect is visible with both digital and real-world attacks, its impact is often spatially confined to areas of the image around the patch. This raises further questions about the spatial robustness of real-time SS models.
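The EOT idea referenced here can be sketched generically as follows. This is a hypothetical minimal example for an image classifier with random patch placement as the only transformation; it is not the authors' segmentation attack, their loss function, or their transformation set.

```python
import torch
import torch.nn.functional as F

def random_placement(img, patch):
    """Paste `patch` at a random location in `img` (a toy stand-in for a
    full EOT transformation set with rotation, scaling, lighting, ...)."""
    _, _, H, W = img.shape
    _, _, ph, pw = patch.shape
    top = torch.randint(0, H - ph + 1, (1,)).item()
    left = torch.randint(0, W - pw + 1, (1,)).item()
    out = img.clone()
    out[:, :, top:top + ph, left:left + pw] = patch
    return out

def eot_patch_attack(model, images, target, patch_size=16, steps=100, lr=0.05):
    """Optimize a universal patch so a classifier predicts class `target`
    in expectation over random placements (the core of EOT).

    `target` is a LongTensor of shape (1,) holding the desired class index.
    """
    patch = torch.rand(1, images.shape[1], patch_size, patch_size, requires_grad=True)
    opt = torch.optim.Adam([patch], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = 0.0
        for img in images.split(1):                       # a small set of scenes
            adv = random_placement(img, patch)
            loss = loss + F.cross_entropy(model(adv), target)
        loss.backward()
        opt.step()
        with torch.no_grad():
            patch.clamp_(0.0, 1.0)                        # keep the patch a valid image
    return patch.detach()
```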
5. Qi B, Zhou B, Zhang W, Liu J, Wu L. Improving Robustness of Intent Detection Under Adversarial Attacks: A Geometric Constraint Perspective. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:6133-6144. [PMID: 37566500] [DOI: 10.1109/tnnls.2023.3267460]
Abstract
Deep neural network (DNN)-based natural language processing (NLP) systems are vulnerable to being fooled by adversarial examples, as shown in recent studies. Intent detection tasks in dialog systems are no exception; however, relatively little work has addressed the defense side. The combination of a linear classifier and softmax is widely used in most defense methods for other NLP tasks. Unfortunately, it does not encourage the model to learn well-separated feature representations, which makes it easy to induce adversarial examples. In this article, we propose a simple yet efficient defense method from the geometric constraint perspective. Specifically, we first propose an M-similarity metric to shrink the variances of intraclass features. Intuitively, better geometric conditions of the feature space can bring a lower misclassification probability (MP). Therefore, we derive the optimal geometric constraints of anchors within each category from the overall MP (OMP) with theoretical guarantees. Because the optimal geometric condition is nonconvex, it is hard to satisfy with traditional optimization. To this end, we regard such geometric constraints as manifold optimization problems on the Stiefel manifold, thus naturally avoiding the above challenges. Experimental results demonstrate that our method can significantly improve robustness compared with baselines, while retaining excellent performance on normal examples.
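To make "shrinking the variances of intraclass features" concrete, a generic center-loss-style penalty is sketched below. It is not the paper's M-similarity metric or its Stiefel-manifold anchor optimization, only an illustration of pulling features toward per-class anchors.

```python
import torch
import torch.nn as nn

class IntraClassVariancePenalty(nn.Module):
    """Generic penalty that shrinks intraclass feature variance by pulling
    each feature vector toward a learnable per-class anchor (center-loss
    style). Illustrative only; not the paper's M-similarity / OMP scheme."""

    def __init__(self, num_classes: int, feat_dim: int):
        super().__init__()
        self.anchors = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # features: (batch, feat_dim); labels: (batch,) integer class indices
        assigned = self.anchors[labels]                  # anchor for each sample
        return ((features - assigned) ** 2).sum(dim=1).mean()

# Usage inside a training loop (lambda_var weights the penalty):
#   total_loss = cross_entropy_loss + lambda_var * penalty(features, labels)
```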
6. Liu F, Tian J, Miranda-Moreno L, Sun L. Adversarial Danger Identification on Temporally Dynamic Graphs. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:4744-4755. [PMID: 37028290] [DOI: 10.1109/tnnls.2023.3252175]
Abstract
Multivariate time series forecasting plays an increasingly critical role in various applications, such as power management, smart cities, finance, and healthcare. Recent advances in temporal graph neural networks (GNNs) have shown promising results in multivariate time series forecasting due to their ability to characterize high-dimensional nonlinear correlations and temporal patterns. However, the vulnerability of deep neural networks (DNNs) raises serious concerns about using these models to make decisions in real-world applications. Currently, how to defend multivariate forecasting models, especially temporal GNNs, remains overlooked. Existing adversarial defense studies are mostly in static, single-instance classification domains and cannot be applied to forecasting because of the generalization challenge and the contradiction issue. To bridge this gap, we propose an adversarial danger identification method for temporally dynamic graphs to effectively protect GNN-based forecasting models. Our method consists of three steps: 1) a hybrid GNN-based classifier to identify dangerous times; 2) approximate linear error propagation to identify the dangerous variates based on the high-dimensional linearity of DNNs; and 3) a scatter filter, controlled by the two identification processes, to reform time series with reduced feature erasure. Our experiments, covering four adversarial attack methods and four state-of-the-art forecasting models, demonstrate the effectiveness of the proposed method in defending forecasting models against adversarial attacks.
7. McCarthy A, Ghadafi E, Andriotis P, Legg P. Defending against adversarial machine learning attacks using hierarchical learning: A case study on network traffic attack classification. JOURNAL OF INFORMATION SECURITY AND APPLICATIONS 2023. [DOI: 10.1016/j.jisa.2022.103398]
8. CAMA: Class activation mapping disruptive attack for deep neural networks. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.05.065]
9. Li X, Jiang Y, Liu C, Liu S, Luo H, Yin S. Playing Against Deep-Neural-Network-Based Object Detectors: A Novel Bidirectional Adversarial Attack Approach. IEEE TRANSACTIONS ON ARTIFICIAL INTELLIGENCE 2022; 3:20-28. [DOI: 10.1109/tai.2021.3107807]
Affiliation(s)
Xiang Li, Yuchen Jiang, Chenglin Liu, Shaochong Liu, and Hao Luo: Department of Control Science and Engineering, Harbin Institute of Technology, Harbin, China.
Shen Yin: Department of Mechanical and Industrial Engineering, Faculty of Engineering, Norwegian University of Science and Technology, Trondheim, Norway.
10. A Survey on Data-Driven Learning for Intelligent Network Intrusion Detection Systems. ELECTRONICS 2022. [DOI: 10.3390/electronics11020213]
Abstract
An effective anomaly-based intelligent IDS (AN-Intel-IDS) must detect both known and unknown attacks. Hence, there is a need to train AN-Intel-IDS using dynamically generated, real-time data in an adversarial setting. Unfortunately, the public datasets available to train AN-Intel-IDS are ineluctably static, unrealistic, and prone to obsolescence. Further, the need to protect private data and conceal sensitive data features has limited data sharing, thus encouraging the use of synthetic data for training predictive and intrusion detection models. However, synthetic data can be unrealistic and potentially biased. On the other hand, real-time data are realistic and current; however, they are inherently imbalanced due to the uneven distribution of anomalous and non-anomalous examples. In general, non-anomalous or normal examples are more frequent than anomalous or attack examples, leading to a skewed distribution. Imbalanced data are predominant in intrusion detection applications and can lead to inaccurate predictions and degraded performance. Furthermore, the lack of real-time data produces potentially biased models that are less effective in predicting unknown attacks. Therefore, training AN-Intel-IDS using imbalanced and adversarial learning is instrumental to its efficacy and high performance. This paper investigates imbalanced learning and adversarial learning for training AN-Intel-IDS using a qualitative study. It surveys and synthesizes generative-based data augmentation techniques for addressing the uneven data distribution and generative-based adversarial techniques for generating synthetic yet realistic data in an adversarial setting, using rapid review, structured reporting, and subgroup analysis.
11.
Abstract
Deep learning (henceforth DL) has become the most powerful machine learning methodology. Under specific circumstances, recognition rates even surpass those obtained by humans. Despite this, several works have shown that deep learning produces outputs that are very far from human responses when confronted with the same task. This is the case of the so-called "adversarial examples" (henceforth AE). The fact that such implausible misclassifications exist points to a fundamental difference between machine and human learning. This paper focuses on the possible causes of this intriguing phenomenon. We first argue that the error in adversarial examples is caused by high bias, i.e., by regularization that has local negative effects. This idea is supported by our experiments, in which the robustness to adversarial examples is measured with respect to the level of fitting to the training samples: higher fitting was associated with higher robustness to adversarial examples. This ties the phenomenon to the trade-off between fitting and generalization that exists in machine learning.
12. Liu X, Lin Y, Li H, Zhang J. A novel method for malware detection on ML-based visualization technique. Computers & Security 2020. [DOI: 10.1016/j.cose.2019.101682]