1
Shao C, Li W, Huo J, Feng Z, Gao Y. Attention-based investigation and solution to the trade-off issue of adversarial training. Neural Netw 2024; 174:106224. [PMID: 38479186] [DOI: 10.1016/j.neunet.2024.106224]
Abstract
Adversarial training has become the mainstream method for boosting the adversarial robustness of deep models. However, it often suffers from a trade-off dilemma: the use of adversarial examples hurts the standard generalization of models on natural data. To study this phenomenon, we investigate it from the perspective of spatial attention. In brief, standard training typically encourages a model to conduct a comprehensive check of the input space, whereas adversarial training often causes a model to concentrate overly on sparse spatial regions. This narrowed focus helps avoid adversarial accumulation, but it easily makes the model ignore abundant discriminative information, resulting in weak generalization. To address this issue, this paper introduces an Attention-Enhanced Learning Framework (AELF) for robust training. The main idea is to enable the model to inherit the attention pattern of a standard pre-trained model through an embedding-level regularization. Specifically, given a teacher model built on natural examples, the embedding distribution of the teacher is used as a static constraint to regulate the embedding outputs of the objective model. This design rests on the observation that the embedding features of a standard model are usually recognized as a rich semantic integration of the input. For implementation, we present a simplified AELF that achieves the regularization with a single cross-entropy loss via a parameter-initialization and parameter-update strategy, avoiding an extra consistency comparison between embedding vectors. Experimental observations verify the rationality of our argument, and experimental results demonstrate that AELF achieves remarkable improvements in generalization while maintaining high-level robustness.
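As a rough illustration of the idea, the sketch below shows the explicit form of such an embedding-level regularization: a frozen, standardly trained teacher constrains the embeddings of the model being adversarially trained. The `embed`/`head` hooks and the MSE consistency term are assumptions for illustration only; the paper's simplified AELF folds the constraint into a single cross-entropy loss via its initialization and update strategy.

```python
import torch
import torch.nn.functional as F

def aelf_style_loss(student, teacher, x_adv, y, lambda_reg=1.0):
    """Cross-entropy on adversarial examples plus a consistency term that
    pulls the student's embedding toward the frozen teacher's embedding."""
    z_student = student.embed(x_adv)        # assumed hook returning the embedding
    logits = student.head(z_student)        # assumed classification head
    with torch.no_grad():
        z_teacher = teacher.embed(x_adv)    # teacher: frozen, standardly trained model
    ce = F.cross_entropy(logits, y)
    reg = F.mse_loss(z_student, z_teacher)  # embedding-level regularization
    return ce + lambda_reg * reg
```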
Affiliation(s)
- Changbin Shao
- Department of Computer Science and Technology, Nanjing University, Nanjing 210023, China; School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212100, China
- Wenbin Li
- Department of Computer Science and Technology, Nanjing University, Nanjing 210023, China.
- Jing Huo
- Department of Computer Science and Technology, Nanjing University, Nanjing 210023, China
- Zhenhua Feng
- School of Computer Science and Electronic Engineering, University of Surrey, Guildford GU2 7XH, UK
- Yang Gao
- Department of Computer Science and Technology, Nanjing University, Nanjing 210023, China
2
Wang R, Ke H, Hu M, Wu W. Adversarially robust neural networks with feature uncertainty learning and label embedding. Neural Netw 2024; 172:106087. [PMID: 38160621] [DOI: 10.1016/j.neunet.2023.12.041]
Abstract
Deep neural networks (DNNs) are vulnerable to adversarial examples, which pose serious security risks to learning systems. In this paper, we propose a new defense method, termed Margin-SNN, that improves the adversarial robustness of DNNs based on stochastic neural networks (SNNs). The proposed Margin-SNN mainly comprises two modules: a feature uncertainty learning module and a label embedding module. The first module introduces uncertainty into the latent feature space by giving each sample a distributional representation rather than a fixed-point representation, and leverages the variational information bottleneck method to achieve good intra-class compactness in the latent space. The second module develops a label embedding mechanism that exploits the semantic information underlying the labels, mapping them into the same latent space as the features in order to capture the similarity between a sample and its class centroid; a penalty term is added to enlarge the margin between different classes for better inter-class separability. Since no adversarial information is introduced, the proposed model can be learned with standard training to improve adversarial robustness, which is much more efficient than adversarial training. Extensive experiments on the MNIST, Fashion-MNIST, CIFAR10, CIFAR100, and SVHN datasets demonstrate the superior defensive ability of the proposed method. Our code is available at https://github.com/humeng24/Margin-SNN.
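A minimal sketch of the two ingredients described above, assuming a backbone that produces a feature vector `h`; the Gaussian reparameterization, distance-based logits, and KL weight are illustrative stand-ins for the authors' exact design.

```python
import torch
import torch.nn.functional as F

class MarginSNNHead(torch.nn.Module):
    """Toy head in the spirit of Margin-SNN: stochastic sample embeddings and
    learnable label embeddings live in the same latent space; logits come from
    sample-to-centroid similarity. Structure and weights are illustrative."""
    def __init__(self, feat_dim, latent_dim, num_classes, kl_weight=1e-2):
        super().__init__()
        self.mu = torch.nn.Linear(feat_dim, latent_dim)      # mean of q(z|x)
        self.logvar = torch.nn.Linear(feat_dim, latent_dim)  # log-variance of q(z|x)
        self.label_emb = torch.nn.Embedding(num_classes, latent_dim)
        self.kl_weight = kl_weight

    def forward(self, h, y):
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)       # reparameterization
        logits = -torch.cdist(z, self.label_emb.weight)               # similarity to class centroids
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp()) # VIB-style compactness
        return F.cross_entropy(logits, y) + self.kl_weight * kl
```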
Affiliation(s)
- Ran Wang
- School of Mathematical Science, Shenzhen University, Shenzhen, 518060, China; Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen University, Shenzhen, 518060, China; Shenzhen Key Laboratory of Advanced Machine Learning and Applications, Shenzhen University, Shenzhen, 518060, China.
- Haopeng Ke
- School of Mathematical Science, Shenzhen University, Shenzhen, 518060, China.
- Meng Hu
- School of Mathematical Science, Shenzhen University, Shenzhen, 518060, China; College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, China.
- Wenhui Wu
- College of Electronic and Information Engineering, Shenzhen University, Shenzhen, 518060, China; Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen University, Shenzhen, 518060, China.
3
Qian Z, Zhang S, Huang K, Wang Q, Yi X, Gu B, Xiong H. Perturbation diversity certificates robust generalization. Neural Netw 2024; 172:106117. [PMID: 38232423] [DOI: 10.1016/j.neunet.2024.106117]
Abstract
Whilst adversarial training has been proven to be one of the most effective defenses against adversarial attacks on deep neural networks, it suffers from over-fitting to the training adversarial data and thus may not guarantee robust generalization. This may result from the fact that conventional adversarial training methods usually generate adversarial perturbations in a supervised way, so the resulting adversarial examples are highly biased towards the decision boundary, leading to an inhomogeneous data distribution. To mitigate this limitation, we propose to generate adversarial examples from a perturbation-diversity perspective. Specifically, the generated perturbed samples are not only adversarial but also diverse, so as to certify robust generalization and significant robustness improvement through a homogeneous data distribution. We provide theoretical and empirical analysis that establishes a foundation for the proposed method. As a major contribution, we prove that promoting perturbation diversity leads to a better robust generalization bound. To verify the method's effectiveness, we conduct extensive experiments over different datasets (e.g., CIFAR-10, CIFAR-100, SVHN) with different adversarial attacks (e.g., PGD, CW). Experimental results show that our method outperforms other state-of-the-art approaches (e.g., PGD and Feature Scattering) in robust generalization performance.
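To make the idea concrete, here is a hedged sketch of a PGD-style inner maximization augmented with a pairwise diversity bonus across several perturbation copies; the paper's actual diversity measure and optimization details may differ.

```python
import torch
import torch.nn.functional as F

def diverse_pgd(model, x, y, eps=8/255, alpha=2/255, steps=10, k=2, beta=0.1):
    """Generate k adversarial copies of the batch while rewarding the copies
    for spreading apart, instead of all collapsing onto the decision boundary."""
    deltas = [torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
              for _ in range(k)]
    for _ in range(steps):
        losses = [F.cross_entropy(model(x + d), y) for d in deltas]
        # pairwise separation between perturbation copies (diversity bonus)
        div = sum(F.mse_loss(a, b)
                  for i, a in enumerate(deltas) for b in deltas[i + 1:])
        obj = sum(losses) + beta * div
        grads = torch.autograd.grad(obj, deltas)
        with torch.no_grad():
            for d, g in zip(deltas, grads):
                d += alpha * g.sign()       # ascend the combined objective
                d.clamp_(-eps, eps)         # stay inside the l-inf ball
    return [(x + d).detach() for d in deltas]
```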
Affiliation(s)
- Zhuang Qian
- Department of Electrical Engineering and Electronics, University of Liverpool, United Kingdom; School of Advanced Technology, Xi'an Jiaotong-Liverpool University, China
- Shufei Zhang
- Shanghai Artificial Intelligence Laboratory, China
- Kaizhu Huang
- Data Science Research Center, Duke Kunshan University, China.
- Qiufeng Wang
- School of Advanced Technology, Xi'an Jiaotong-Liverpool University, China.
- Xinping Yi
- Department of Electrical Engineering and Electronics, University of Liverpool, United Kingdom
- Bin Gu
- Mohamed bin Zayed University of Artificial Intelligence, United Arab Emirates
- Huan Xiong
- Mohamed bin Zayed University of Artificial Intelligence, United Arab Emirates; Institute for Advanced Study in Mathematics, Harbin Institute of Technology, China
4
Mu R, Marcolino L, Ni Q, Ruan W. Enhancing robustness in video recognition models: Sparse adversarial attacks and beyond. Neural Netw 2024; 171:127-143. [PMID: 38091756] [DOI: 10.1016/j.neunet.2023.11.056]
Abstract
Recent years have witnessed increasing interest in adversarial attacks on images, while adversarial attacks on videos have seldom been explored. In this paper, we propose a sparse adversarial attack strategy on videos (DeepSAVA). Our model aims to add a small, human-imperceptible perturbation to the key frame of the input video to fool classifiers. To carry out an effective attack that mirrors real-world scenarios, our algorithm integrates spatial-transformation perturbations into the frame. Instead of using the ℓp norm to gauge the disparity between the perturbed and original frames, we employ the structural similarity index (SSIM), which has been established as a more suitable metric for quantifying image alterations resulting from spatial perturbations. We employ a unified optimisation framework to combine spatial transformation with additive perturbation, thereby attaining a more potent attack. We design an effective and novel optimisation scheme that alternately utilises Bayesian optimisation (BO) to identify the most critical frame in a video and stochastic gradient descent (SGD) based optimisation to produce both additive and spatially transformed perturbations. Doing so enables DeepSAVA to perform a very sparse attack on videos, maintaining human imperceptibility while still achieving state-of-the-art performance in terms of both attack success rate and adversarial transferability. Furthermore, built upon the strong perturbations produced by DeepSAVA, we design a novel adversarial training framework to improve the robustness of video classification models. Our extensive experiments on various types of deep neural networks and video datasets confirm the superiority of DeepSAVA in terms of attacking performance and efficiency. Compared to the baseline techniques, DeepSAVA exhibits the highest performance in generating adversarial videos for three distinct video classifiers. Remarkably, it achieves a fooling rate ranging from 99.5% to 100% for the I3D model by perturbing just a single frame. DeepSAVA also demonstrates favourable transferability across various time-series models. The proposed adversarial training strategy is likewise empirically shown to train more robust video classifiers than state-of-the-art adversarial training with a projected gradient descent (PGD) adversary.
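The following sketch illustrates only the additive, SSIM-constrained part of such an attack on one pre-selected frame; the Bayesian-optimisation frame selection and the spatial-transformation component are omitted, and the `ssim` helper is assumed to come from torchmetrics.

```python
import torch
import torch.nn.functional as F
from torchmetrics.functional import structural_similarity_index_measure as ssim

def attack_key_frame(model, video, label, frame_idx, steps=50, lr=1e-2, c=1.0):
    """Perturb a single pre-selected frame, trading misclassification against
    SSIM similarity to the clean frame. video: (T, C, H, W) with values in [0, 1];
    model is assumed to classify a (1, T, C, H, W) clip; label is a scalar tensor."""
    delta = torch.zeros_like(video[frame_idx], requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        adv = video.clone()
        adv[frame_idx] = (video[frame_idx] + delta).clamp(0, 1)
        logits = model(adv.unsqueeze(0))
        adv_frame = adv[frame_idx].unsqueeze(0)
        clean_frame = video[frame_idx].unsqueeze(0)
        # minimise: negative CE (i.e. maximise misclassification) + SSIM penalty
        loss = -F.cross_entropy(logits, label.view(1)) \
               + c * (1 - ssim(adv_frame, clean_frame))
        opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        adv = video.clone()
        adv[frame_idx] = (video[frame_idx] + delta).clamp(0, 1)
    return adv
```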
Affiliation(s)
- Ronghui Mu
- Department of Computer Science, University of Liverpool, Liverpool, UK.
- Leandro Marcolino
- School of Computing and Communications, Lancaster University, Lancaster, UK.
- Qiang Ni
- School of Computing and Communications, Lancaster University, Lancaster, UK.
- Wenjie Ruan
- Department of Computer Science, University of Liverpool, Liverpool, UK.
5
He L, Ai Q, Yang X, Ren Y, Wang Q, Xu Z. Boosting adversarial robustness via self-paced adversarial training. Neural Netw 2023; 167:706-714. [PMID: 37729786] [DOI: 10.1016/j.neunet.2023.08.063]
Abstract
Adversarial training is considered one of the most effective methods for improving the adversarial robustness of deep neural networks. Despite its success, it still suffers from unsatisfactory performance and overfitting. Considering the intrinsic mechanism of adversarial training, recent studies adopt the idea of curriculum learning to alleviate overfitting. However, this introduces new issues, namely the lack of a quantitative criterion for attack strength and catastrophic forgetting. To mitigate these issues, we propose self-paced adversarial training (SPAT), which explicitly builds the learning process of adversarial training on the adversarial examples of the whole dataset. Specifically, our model is first trained with "easy" adversarial examples and then continuously strengthened by gradually adding "complex" adversarial examples. This strengthens the ability to fit "complex" adversarial examples while keeping "easy" adversarial examples in mind. To balance adversarial examples between classes, we determine the difficulty of adversarial examples locally within each class. Notably, this learning paradigm can also be incorporated into other advanced methods to further boost adversarial robustness. Experimental results show the effectiveness of our proposed model against various attacks on widely used benchmarks. In particular, on CIFAR100, SPAT provides a boost of 1.7% (relatively 5.4%) in robust accuracy under the PGD10 attack and 3.9% (relatively 7.2%) in natural accuracy for AWP.
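A hedged sketch of the class-local, self-paced selection described above; the per-sample loss as the difficulty measure and the growing `keep_ratio` schedule are illustrative assumptions, not the authors' exact criterion.

```python
import torch
import torch.nn.functional as F

def select_by_pace(model, x_adv, y, keep_ratio):
    """Rank adversarial examples by per-sample loss within each class and keep
    only the easiest fraction; keep_ratio is assumed to grow during training."""
    with torch.no_grad():
        losses = F.cross_entropy(model(x_adv), y, reduction="none")
    keep = torch.zeros_like(y, dtype=torch.bool)
    for c in y.unique():                        # class-local thresholds balance classes
        idx = (y == c).nonzero(as_tuple=True)[0]
        k = max(1, int(keep_ratio * idx.numel()))
        easiest = losses[idx].argsort()[:k]     # smallest loss = "easy" examples
        keep[idx[easiest]] = True
    return x_adv[keep], y[keep]
```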
Affiliation(s)
- Lirong He
- School of Information Science and Engineering, Chongqing Jiaotong University, Chongqing, China; School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Qingzhong Ai
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Xincheng Yang
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Yazhou Ren
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Zenglin Xu
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, China; School of Computer Science and Technology, Harbin Institute of Technology Shenzhen, Shenzhen, Guangdong, China; Peng Cheng Lab, Shenzhen, Guangdong, China.
6
Kim H, Lee W, Lee S, Lee J. Bridged adversarial training. Neural Netw 2023; 167:266-282. [PMID: 37666185] [DOI: 10.1016/j.neunet.2023.08.024]
Abstract
Adversarial robustness is considered a required property of deep neural networks. In this study, we discover that adversarially trained models may have significantly different characteristics in terms of margin and smoothness even when they show similar robustness. Inspired by this observation, we investigate the effect of different regularizers and discover the negative effect of the smoothness regularizer on maximizing the margin. Based on these analyses, we propose a new method called bridged adversarial training, which mitigates the negative effect by bridging the gap between clean and adversarial examples. We provide theoretical and empirical evidence that the proposed method provides stable and better robustness, especially for large perturbations.
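As one plausible reading of the bridging idea, the sketch below averages the loss over points interpolated between each clean example and its adversarial counterpart, so training sees the path between the two rather than only its endpoints; the authors' actual bridge construction (e.g., the specific divergence terms) may differ.

```python
import torch
import torch.nn.functional as F

def bridged_loss(model, x_clean, x_adv, y, m=4):
    """Average the classification loss over m points on the straight path
    from each clean example to its adversarial example."""
    total = 0.0
    for t in torch.linspace(0.0, 1.0, m):
        x_t = (1 - t) * x_clean + t * x_adv   # point on the clean-to-adversarial path
        total = total + F.cross_entropy(model(x_t), y)
    return total / m
```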
Affiliation(s)
- Hoki Kim
- Institute of Engineering Research, Seoul National University, Gwanak-gu 08826, Republic of Korea
- Woojin Lee
- School of AI Convergence, Dongguk University-Seoul, Jung-gu 04620, Republic of Korea
- Sungyoon Lee
- Department of Computer Science, Hanyang University, Seongdong-gu 04763, Republic of Korea
- Jaewook Lee
- Department of Industrial Engineering, Seoul National University, Gwanak-gu 08826, Republic of Korea.
7
Ma L, Liang L. Increasing-Margin Adversarial (IMA) training to improve adversarial robustness of neural networks. Comput Methods Programs Biomed 2023; 240:107687. [PMID: 37392695] [PMCID: PMC10527180] [DOI: 10.1016/j.cmpb.2023.107687]
Abstract
BACKGROUND AND OBJECTIVE: Deep neural networks (DNNs) are vulnerable to adversarial noise. Adversarial training is a general and effective strategy for improving DNN robustness (i.e., accuracy on noisy data) against adversarial noise. However, DNN models trained by current adversarial training methods may have much lower standard accuracy (i.e., accuracy on clean data) than the same models trained by the standard method on clean data, a phenomenon known as the trade-off between accuracy and robustness that is commonly considered unavoidable. This issue prevents adversarial training from being used in many application domains, such as medical image analysis, where practitioners do not want to sacrifice too much standard accuracy in exchange for adversarial robustness. Our objective is to lift (i.e., alleviate or even avoid) this trade-off between standard accuracy and adversarial robustness for medical image classification and segmentation.
METHODS: We propose a novel adversarial training method, named Increasing-Margin Adversarial (IMA) Training, which is supported by an equilibrium-state analysis of the optimality of adversarial training samples. Our method aims to preserve accuracy while improving robustness by generating optimal adversarial training samples. We evaluate our method and eight other representative methods on six publicly available image datasets corrupted by noise generated by AutoAttack and a white-noise attack.
RESULTS: Our method achieves the highest adversarial robustness for image classification and segmentation with the smallest reduction in accuracy on clean data. For one of the applications, our method improves both accuracy and robustness.
CONCLUSIONS: Our study demonstrates that our method can lift the trade-off between standard accuracy and adversarial robustness for image classification and segmentation applications. To our knowledge, it is the first work to show that the trade-off is avoidable for medical image segmentation.
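A hedged sketch of a per-sample, gradually expanding perturbation budget in the spirit of IMA; `attack` is an assumed PGD-style helper, and the grow/shrink rule is illustrative rather than the paper's exact equilibrium-based update.

```python
import torch

def update_margins(model, x, y, eps, eps_step, eps_max, attack):
    """Per-sample expanding margins: eps[i] is sample i's current perturbation
    budget. attack(model, x, y, eps) is an assumed PGD-style helper that
    respects per-sample budgets and returns adversarial examples."""
    x_adv = attack(model, x, y, eps)
    with torch.no_grad():
        survived = model(x_adv).argmax(dim=1) == y      # model resisted the attack
    # enlarge the margin where the sample survived, pull it back where it fell
    eps = torch.where(survived, (eps + eps_step).clamp(max=eps_max), eps - eps_step)
    return x_adv, eps.clamp(min=eps_step)
```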
Affiliation(s)
- Linhai Ma
- Department of Computer Science, University of Miami, 1365 Memorial Drive, Coral Gables, 33146, FL, USA.
- Liang Liang
- Department of Computer Science, University of Miami, 1365 Memorial Drive, Coral Gables, 33146, FL, USA.
8
Zhang H, Cheng J, Zhang J, Liu H, Wei Z. A regularization perspective based theoretical analysis for adversarial robustness of deep spiking neural networks. Neural Netw 2023; 165:164-174. [PMID: 37295205] [DOI: 10.1016/j.neunet.2023.05.038]
Abstract
Spiking Neural Networks (SNNs) have been recognized as the third generation of neural networks. Conventionally, an SNN can be converted from a pre-trained Artificial Neural Network (ANN) with less computation and memory than training one from scratch, but these converted SNNs are vulnerable to adversarial attacks. Numerical experiments demonstrate that an SNN trained directly by optimizing the loss function is more adversarially robust, yet a theoretical analysis of the mechanism behind this robustness has been lacking. In this paper, we provide a theoretical explanation by analyzing the expected risk function. Starting from a model of the stochastic process introduced by the Poisson encoder, we prove that the expected risk contains a positive semidefinite regularizer. Perhaps surprisingly, this regularizer pushes the gradients of the output with respect to the input closer to zero, resulting in inherent robustness against adversarial attacks. Extensive experiments on the CIFAR10 and CIFAR100 datasets support our view. For example, we find that the sum of squared input gradients of converted SNNs is 13-160 times that of directly trained SNNs, and the smaller this sum, the smaller the degradation of accuracy under adversarial attack.
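The diagnostic quantity referenced in the experiments, the sum of squared input gradients, can be computed as in the sketch below; summing the logits before differentiation is a simplification made here for illustration.

```python
import torch

def output_grad_sq_sum(model, x):
    """Sum of squared gradients of the model output with respect to the input,
    the quantity the analysis links to inherent adversarial robustness."""
    x = x.clone().requires_grad_(True)
    out = model(x)                                # (B, num_classes)
    (grad,) = torch.autograd.grad(out.sum(), x)   # d(summed outputs) / d(input)
    return grad.pow(2).sum()
```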
Affiliation(s)
- Hui Zhang
- Nanjing University of Science and Technology, Nanjing, 210094, China
- Jun Zhang
- Nanjing University of Science and Technology, Nanjing, 210094, China
- Hongyi Liu
- Nanjing University of Science and Technology, Nanjing, 210094, China
- Zhihui Wei
- Nanjing University of Science and Technology, Nanjing, 210094, China.
9
Manzari ON, Ahmadabadi H, Kashiani H, Shokouhi SB, Ayatollahi A. MedViT: A robust vision transformer for generalized medical image classification. Comput Biol Med 2023; 157:106791. [PMID: 36958234] [DOI: 10.1016/j.compbiomed.2023.106791]
Abstract
Convolutional Neural Networks (CNNs) have advanced existing medical systems for automatic disease diagnosis. However, there remain concerns about the reliability of deep medical diagnosis systems under the potential threat of adversarial attacks, since an inaccurate diagnosis could have disastrous safety consequences. In this study, we propose a highly robust yet efficient CNN-Transformer hybrid model equipped with the locality of CNNs as well as the global connectivity of vision Transformers. To mitigate the high quadratic complexity of the self-attention mechanism while jointly attending to information in various representation subspaces, we construct our attention mechanism by means of an efficient convolution operation. Moreover, to alleviate the fragility of our Transformer model against adversarial attacks, we attempt to learn smoother decision boundaries. To this end, we augment the shape information of an image in the high-level feature space by permuting the feature mean and variance within mini-batches. With lower computational complexity, our proposed hybrid model demonstrates high robustness and generalization ability compared to state-of-the-art studies on a large-scale collection of standardized MedMNIST-2D datasets.
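A minimal sketch of the mean/variance permutation described above, assuming standard (B, C, H, W) feature maps; the exact placement in the network and any mixing coefficients are not specified here.

```python
import torch

def permute_moments(feat):
    """Swap per-sample feature mean and variance across a shuffled mini-batch,
    keeping each sample's normalized (shape-like) content. feat: (B, C, H, W)."""
    b = feat.size(0)
    mu = feat.mean(dim=(2, 3), keepdim=True)           # per-sample channel mean
    sigma = feat.std(dim=(2, 3), keepdim=True) + 1e-6  # per-sample channel std
    perm = torch.randperm(b, device=feat.device)
    normalized = (feat - mu) / sigma                   # retains shape information
    return normalized * sigma[perm] + mu[perm]         # inject permuted statistics
```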
Affiliation(s)
- Omid Nejati Manzari
- School of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran.
- Hamid Ahmadabadi
- School of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran
- Hossein Kashiani
- Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, USA
- Shahriar B Shokouhi
- School of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran
- Ahmad Ayatollahi
- School of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran
10
Rodriguez D, Nayak T, Chen Y, Krishnan R, Huang Y. On the role of deep learning model complexity in adversarial robustness for medical images. BMC Med Inform Decis Mak 2022; 22:160. [PMID: 35725429] [PMCID: PMC9208111] [DOI: 10.1186/s12911-022-01891-w]
Abstract
Background: Deep learning (DL) models are highly vulnerable to adversarial attacks in medical image classification. An adversary could modify the input data in imperceptible ways such that a model is tricked into predicting, say, that an image actually exhibiting a malignant tumor is benign. However, the adversarial robustness of DL models for medical images has not been adequately studied. DL in medicine is inundated with models of various complexity, particularly very large models. In this work, we investigate the role of model complexity in adversarial settings.
Results: Consider a set of DL models that exhibit similar performance for a given task, trained in the usual manner without any defense against adversarial attacks. We demonstrate that, among such models, simpler models of reduced complexity show a greater level of robustness against adversarial attacks than the larger models that often tend to be used in medical applications. On the other hand, once these models undergo adversarial training, the adversarially trained medical-image DL models exhibit a greater degree of robustness than the standardly trained models across all model complexities.
Conclusion: These results have significant practical relevance. When medical practitioners lack the expertise or resources to defend against adversarial attacks, we recommend that they select the smallest model that exhibits adequate performance; such a model will naturally be more robust to adversarial attacks than a larger one.
Affiliation(s)
- David Rodriguez
- Department of Electrical and Computer Engineering, University of Texas at San Antonio, San Antonio, TX, USA
- Tapsya Nayak
- Greehey Children's Cancer Research Institute, University of Texas Health San Antonio, San Antonio, TX, USA
- Yidong Chen
- Greehey Children's Cancer Research Institute, University of Texas Health San Antonio, San Antonio, TX, USA; Department of Population Health Sciences, University of Texas Health San Antonio, San Antonio, TX, USA
- Ram Krishnan
- Department of Electrical and Computer Engineering, University of Texas at San Antonio, San Antonio, TX, USA
- Yufei Huang
- Department of Medicine, School of Medicine, UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, USA.
11
Panda P, Roy K. Implicit adversarial data augmentation and robustness with Noise-based Learning. Neural Netw 2021; 141:120-132. [PMID: 33894652] [DOI: 10.1016/j.neunet.2021.04.008]
Abstract
We introduce a Noise-based Learning (NoL) approach for training neural networks that are intrinsically robust to adversarial attacks. We find that learning random noise introduced with the input, using the same loss function employed during posterior maximization, improves a model's adversarial resistance. We show that the learnt noise performs implicit adversarial data augmentation, boosting a model's adversarial generalization capability. We evaluate our approach's efficacy and provide a simple visualization tool for understanding adversarial data, using Principal Component Analysis. We conduct comprehensive experiments on prevailing benchmarks such as MNIST, CIFAR10, CIFAR100, and Tiny ImageNet, and show that our approach performs remarkably well against a wide range of attacks. Furthermore, combining NoL with state-of-the-art defense mechanisms, such as adversarial training, consistently outperforms prior techniques in both white-box and black-box attacks.
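A hedged sketch of the training step implied above: a learnable noise tensor is added to the input and updated by the same cross-entropy loss as the weights. The single shared noise tensor and the SGD optimizer are simplifying assumptions, not the authors' exact setup.

```python
import torch
import torch.nn.functional as F

def make_nol_trainer(model, input_shape, lr=1e-3):
    """Build a training step in the spirit of NoL: one noise tensor, broadcast
    over the batch, is optimized jointly with the network parameters."""
    noise = torch.zeros(1, *input_shape, requires_grad=True)  # broadcast over batch
    opt = torch.optim.SGD(list(model.parameters()) + [noise], lr=lr)

    def step(x, y):
        opt.zero_grad()
        loss = F.cross_entropy(model(x + noise), y)  # same posterior-maximization loss
        loss.backward()                              # gradients reach weights and noise
        opt.step()
        return loss.item()

    return step
```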
Affiliation(s)
| | - Kaushik Roy
- School of Electrical and Computer Engineering, Purdue University, West Lafayette, USA.