1
Alkadi S, Al-Ahmadi S, Ben Ismail MM. RobEns: Robust Ensemble Adversarial Machine Learning Framework for Securing IoT Traffic. Sensors (Basel) 2024; 24:2626. [PMID: 38676241] [PMCID: PMC11053586] [DOI: 10.3390/s24082626]
Abstract
Recently, Machine Learning (ML)-based solutions have been widely adopted to tackle the wide range of security challenges that have affected the progress of the Internet of Things (IoT) in various domains. Despite the reported promising results, ML-based Intrusion Detection Systems (IDSs) have proved vulnerable to adversarial examples, which pose an increasing threat. In fact, attackers employ Adversarial Machine Learning (AML) to cause severe performance degradation and thereby evade detection systems. This has prompted the need for reliable defense strategies that preserve performance and keep networks secure. This work introduces RobEns, a robust ensemble framework that aims at: (i) exploiting state-of-the-art ML-based models alongside ensemble models for IDSs in the IoT network; (ii) investigating the impact of evasion AML attacks against the provided models within a black-box scenario; and (iii) evaluating the robustness of the considered models after deploying relevant defense methods. In particular, four typical AML attacks are considered to investigate six ML-based IDSs using three benchmark datasets. Moreover, multi-class classification scenarios are designed to assess the performance of each attack type. The experiments indicated a drastic drop in detection accuracy under some attacks. To harden the IDS even further, two defense mechanisms were derived from data-based and model-based methods. Specifically, these methods relied on feature squeezing and adversarial training defense strategies. They yielded promising results, enhanced robustness, and maintained standard accuracy in the presence or absence of adversaries. The obtained results proved the efficiency of the proposed framework in robustifying IDS performance within the IoT context. In particular, the accuracy reached 100% for black-box attack scenarios while also preserving the accuracy in the absence of attacks.
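As a rough illustration of the feature-squeezing side of the defenses mentioned above, here is a minimal sketch, assuming bit-depth reduction as the squeezer (one common choice; the paper's actual squeezers, detector threshold, and model interface are not specified in the abstract, so `model_predict`, `bits`, and `threshold` are placeholders):

```python
import numpy as np

def squeeze_bit_depth(x: np.ndarray, bits: int = 4) -> np.ndarray:
    """Reduce the bit depth of features scaled to [0, 1], collapsing
    small adversarial perturbations onto a coarser grid."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels

def flag_adversarial(model_predict, x: np.ndarray,
                     bits: int = 4, threshold: float = 0.5) -> bool:
    """Flag an input when predictions on the raw and squeezed versions
    disagree by more than a threshold (placeholder value)."""
    diff = model_predict(x) - model_predict(squeeze_bit_depth(x, bits))
    return float(np.abs(diff).sum()) > threshold
```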
Affiliation(s)
- Sarah Alkadi
- Department of Computer Science, College of Computer and Information Sciences, King Saud University, Riyadh 11362, Saudi Arabia
2
Alkhowaiter M, Kholidy H, Alyami MA, Alghamdi A, Zou C. Adversarial-Aware Deep Learning System Based on a Secondary Classical Machine Learning Verification Approach. Sensors (Basel) 2023; 23:6287. [PMID: 37514582] [PMCID: PMC10384939] [DOI: 10.3390/s23146287]
Abstract
Deep learning models have been used to create various effective image classification applications. However, they are vulnerable to adversarial attacks that seek to misguide the models into predicting incorrect classes. Our study of major adversarial attack models shows that they all specifically target and exploit the neural network structures in their designs. This understanding led us to develop a hypothesis that most classical machine learning models, such as random forest (RF), are immune to adversarial attack models because they do not rely on neural network designs at all. Our experimental study of classical machine learning models against popular adversarial attacks supports this hypothesis. Based on this hypothesis, we propose a new adversarial-aware deep learning system that uses a classical machine learning model as a secondary verification system to complement the primary deep learning model in image classification. Although the secondary classical machine learning model produces less accurate output, it is used only for verification, so it does not affect the output accuracy of the primary deep learning model while effectively detecting an adversarial attack whenever a clear mismatch occurs. Our experiments based on the CIFAR-100 dataset show that our proposed approach outperforms current state-of-the-art adversarial defense systems.
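A minimal sketch of the verification idea, assuming a top-k agreement rule between the primary DNN and the secondary classical model (the paper's exact mismatch criterion is not given in the abstract; `agree_top_k` is a placeholder):

```python
import numpy as np

def verify_prediction(dnn_probs: np.ndarray, rf_probs: np.ndarray,
                      agree_top_k: int = 3) -> bool:
    """Accept the DNN's prediction only if its top class appears among the
    secondary classical model's top-k classes; a clear mismatch suggests an
    adversarial input crafted against the neural network."""
    dnn_top = int(np.argmax(dnn_probs))
    rf_top_k = np.argsort(rf_probs)[::-1][:agree_top_k]
    return dnn_top in rf_top_k
```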
Affiliation(s)
- Mohammed Alkhowaiter
- College of Engineering and Computer Science, University of Central Florida, Orlando, FL 32816, USA
- College of Computer Engineering and Science, Prince Sattam Bin Abdulaziz University, Al-Kharj 11942, Saudi Arabia
- Hisham Kholidy
- College of Engineering, SUNY Polytechnic Institute, Utica, NY 13502, USA
- Mnassar A Alyami
- College of Engineering and Computer Science, University of Central Florida, Orlando, FL 32816, USA
- Abdulmajeed Alghamdi
- College of Engineering and Computer Science, University of Central Florida, Orlando, FL 32816, USA
- Cliff Zou
- College of Engineering and Computer Science, University of Central Florida, Orlando, FL 32816, USA
3
Lucieri A, Dengel A, Ahmed S. Translating theory into practice: assessing the privacy implications of concept-based explanations for biomedical AI. Front Bioinform 2023; 3:1194993. [PMID: 37484865] [PMCID: PMC10356902] [DOI: 10.3389/fbinf.2023.1194993]
Abstract
Artificial Intelligence (AI) has achieved remarkable success in image generation, image analysis, and language modeling, making data-driven techniques increasingly relevant in practical real-world applications and promising enhanced creativity and efficiency for human users. However, the deployment of AI in high-stakes domains such as infrastructure and healthcare still raises concerns regarding algorithm accountability and safety. The emerging field of explainable AI (XAI) has made significant strides in developing interfaces that enable humans to comprehend the decisions made by data-driven models. Among these approaches, concept-based explainability stands out due to its ability to align explanations with high-level concepts familiar to users. Nonetheless, early research in adversarial machine learning has unveiled that exposing model explanations can render victim models more susceptible to attacks. This is the first study to investigate and compare the impact of concept-based explanations on the privacy of Deep Learning based AI models in the context of biomedical image analysis. An extensive privacy benchmark is conducted on three different state-of-the-art model architectures (ResNet50, NFNet, ConvNeXt) trained on two biomedical (ISIC and EyePACS) and one synthetic dataset (SCDB). The success of membership inference attacks while exposing varying degrees of attribution-based and concept-based explanations is systematically compared. The findings indicate that, in theory, concept-based explanations can increase the vulnerability of a private AI system by up to 16% compared to attributions in the baseline setting. However, it is demonstrated that, in more realistic attack scenarios, the threat posed by explanations is negligible in practice. Furthermore, actionable recommendations are provided to ensure the safe deployment of concept-based XAI systems. In addition, the impact of differential privacy (DP) on the quality of concept-based explanations is explored, revealing that, while DP degrades explanation quality, it benefits the models' privacy.
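For orientation, a minimal sketch of the simplest confidence-threshold membership inference attack underlying such privacy benchmarks (the paper's attacks additionally exploit attribution- and concept-based explanations, which this sketch does not reproduce):

```python
import numpy as np

def confidence_attack(member_conf: np.ndarray, nonmember_conf: np.ndarray,
                      target_conf: np.ndarray) -> np.ndarray:
    """Pick the confidence cutoff that best separates known members from
    non-members (e.g. from a shadow model), then guess that target samples
    above it were part of the training set."""
    best_t, best_acc = 0.0, 0.0
    for t in np.unique(np.concatenate([member_conf, nonmember_conf])):
        acc = 0.5 * ((member_conf >= t).mean() + (nonmember_conf < t).mean())
        if acc > best_acc:
            best_t, best_acc = float(t), acc
    return target_conf >= best_t
```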
Affiliation(s)
- Adriano Lucieri
- Smart Data and Knowledge Services (SDS), Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI) GmbH, Kaiserslautern, Germany
- Computer Science Department, RPTU Kaiserslautern-Landau, Kaiserslautern, Germany
- Andreas Dengel
- Smart Data and Knowledge Services (SDS), Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI) GmbH, Kaiserslautern, Germany
- Computer Science Department, RPTU Kaiserslautern-Landau, Kaiserslautern, Germany
- Sheraz Ahmed
- Smart Data and Knowledge Services (SDS), Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI) GmbH, Kaiserslautern, Germany
4
Shao K, Yang J, Hu P, Li X. A Textual Backdoor Defense Method Based on Deep Feature Classification. Entropy (Basel) 2023; 25:220. [PMID: 36832587] [PMCID: PMC9955932] [DOI: 10.3390/e25020220]
Abstract
Natural language processing (NLP) models based on deep neural networks (DNNs) are vulnerable to backdoor attacks, and existing backdoor defense methods have limited effectiveness and cover only limited scenarios. We propose a textual backdoor defense method based on deep feature classification. The method comprises deep feature extraction and classifier construction, and exploits the distinguishability between the deep features of poisoned data and benign data. Backdoor defense is implemented in both offline and online scenarios. We conducted defense experiments on two datasets and two models against a variety of backdoor attacks. The experimental results demonstrate that this defense approach is effective and outperforms the baseline defense method.
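A minimal sketch of the deep-feature-classification idea, assuming penultimate-layer activations as the deep features and a logistic-regression detector (the paper's feature extractor and classifier choices are not specified in the abstract):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_poison_detector(deep_features: np.ndarray, labels: np.ndarray):
    """Train a detector on deep (e.g. penultimate-layer) features, relying
    on poisoned and benign samples separating well in that space.
    labels: 1 = known-poisoned, 0 = known-benign calibration samples."""
    return LogisticRegression(max_iter=1000).fit(deep_features, labels)

def is_poisoned(detector, feature_vec: np.ndarray) -> bool:
    """Online use: flag an incoming input by its deep-feature vector."""
    return bool(detector.predict(feature_vec.reshape(1, -1))[0])
```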
5
Rohanian O, Kouchaki S, Soltan A, Yang J, Rohanian M, Yang Y, Clifton D. Privacy-Aware Early Detection of COVID-19 Through Adversarial Training. IEEE J Biomed Health Inform 2022; PP:1249-1258. [PMID: 37015447] [PMCID: PMC10824398] [DOI: 10.1109/jbhi.2022.3230663]
Abstract
Early detection of COVID-19 is an ongoing area of research that can help with triage, monitoring, and general health assessment of potential patients, and may reduce operational strain on hospitals coping with the coronavirus pandemic. Different machine learning techniques have been used in the literature to detect potential cases of coronavirus using routine clinical data (blood tests and vital-sign measurements). Data breaches and information leakage when using these models can bring reputational damage and cause legal issues for hospitals. In spite of this, protecting healthcare models against leakage of potentially sensitive information is an understudied research area. In this study, two machine learning techniques that aim to predict a patient's COVID-19 status are examined. Using adversarial training, robust deep learning architectures are explored with the aim of protecting attributes related to demographic information about the patients. The two models examined in this work are intended to preserve sensitive information against adversarial attacks and information leakage. In a series of experiments using datasets from the Oxford University Hospitals (OUH), Bedfordshire Hospitals NHS Foundation Trust (BH), University Hospitals Birmingham NHS Foundation Trust (UHB), and Portsmouth Hospitals University NHS Trust (PUH), two neural networks are trained and evaluated. These networks predict PCR test results using information from basic laboratory blood tests and vital signs collected from a patient upon arrival to the hospital. The level of privacy each of the models can provide is assessed, and the efficacy and robustness of the proposed architectures are compared with a relevant baseline. One of the main contributions of this work is the particular focus on developing effective COVID-19 detection models with built-in mechanisms that selectively protect sensitive attributes against adversarial attacks. The results on the hold-out test set and external validation confirmed that there was no impact on the generalisability of the model when using adversarial learning.
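Gradient reversal is one standard way to realize this kind of attribute-protecting adversarial training; the sketch below assumes that formulation (the paper's exact architectures and losses are not given in the abstract, and all names here are illustrative):

```python
import torch
import torch.nn as nn

class GradientReversal(torch.autograd.Function):
    """Identity in the forward pass; negated, scaled gradient in the
    backward pass, pushing the encoder to discard attribute information."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class PrivacyAwareNet(nn.Module):
    """Shared encoder with a task head (COVID-19 status) and an adversarial
    head that tries to recover a protected demographic attribute."""
    def __init__(self, d_in: int, d_hidden: int, n_attr: int, lam: float = 1.0):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.task_head = nn.Linear(d_hidden, 1)
        self.attr_head = nn.Linear(d_hidden, n_attr)
        self.lam = lam

    def forward(self, x):
        z = self.encoder(x)
        return self.task_head(z), self.attr_head(GradientReversal.apply(z, self.lam))
```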
Affiliation(s)
- Omid Rohanian
- Department of Engineering Science, University of Oxford, Oxford OX3 7DQ, U.K.
- Samaneh Kouchaki
- Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford GU2 7XH, U.K.
- U.K. Dementia Research Institute Care Research and Technology Centre, Imperial College London, London SW7 2BX, U.K.
- University of Surrey, Guildford GU2 7XH, U.K.
- Andrew Soltan
- John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford OX3 7DQ, U.K.
- RDM Division of Cardiovascular Medicine, University of Oxford, Oxford OX3 7DQ, U.K.
- Jenny Yang
- Department of Engineering Science, University of Oxford, Oxford OX3 7DQ, U.K.
- Yang Yang
- Department of Engineering Science, University of Oxford, Oxford OX3 7DQ, U.K.
- School of Public Health, Shanghai Jiao Tong University, Shanghai 200240, China
- School of Medicine, Shanghai Jiao Tong University, Shanghai 200240, China
- David Clifton
- Department of Engineering Science, University of Oxford, Oxford OX3 7DQ, U.K.
- Oxford-China Centre for Advanced Research, Suzhou 215123, China
6
Zong W, Chow YW, Susilo W, Kim J, Le NT. Detecting Audio Adversarial Examples in Automatic Speech Recognition Systems Using Decision Boundary Patterns. J Imaging 2022; 8:324. [PMID: 36547489] [DOI: 10.3390/jimaging8120324]
Abstract
Automatic Speech Recognition (ASR) systems are ubiquitous in various commercial applications. These systems typically rely on machine learning techniques for transcribing voice commands into text for further processing. Despite their success in many applications, audio Adversarial Examples (AEs) have emerged as a major security threat to ASR systems, because audio AEs are able to fool ASR models into producing incorrect results. While researchers have investigated methods for defending against audio AEs, the intrinsic properties of AEs and benign audio are not well studied. The work in this paper shows that the machine learning decision boundary patterns around audio AEs and benign audio are fundamentally different. Using dimensionality-reduction techniques, this work shows that these different patterns can be visually distinguished in two-dimensional (2D) space. This in turn allows for the detection of audio AEs using anomaly-detection methods.
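A minimal sketch of probing decision-boundary patterns around an input and projecting them to 2D, assuming PCA as the dimensionality-reduction step and uniform random probes (the paper's exact probing and projection choices are not stated in the abstract):

```python
import numpy as np
from sklearn.decomposition import PCA

def boundary_pattern_2d(model_predict, x: np.ndarray, n_probes: int = 200,
                        radius: float = 0.05, seed: int = 0):
    """Probe the model with small random perturbations around a (flattened)
    input and project the probes to 2D; the labels over the projected
    neighbourhood form a visualisable decision-boundary pattern."""
    rng = np.random.default_rng(seed)
    probes = x + rng.uniform(-radius, radius, size=(n_probes, x.size))
    labels = np.array([model_predict(p) for p in probes])
    coords = PCA(n_components=2).fit_transform(probes)
    return coords, labels  # e.g. scatter-plot coords coloured by labels
```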
7
Benaddi H, Jouhari M, Ibrahimi K, Ben Othman J, Amhoud EM. Anomaly Detection in Industrial IoT Using Distributional Reinforcement Learning and Generative Adversarial Networks. Sensors (Basel) 2022; 22:8085. [PMID: 36365782] [PMCID: PMC9656136] [DOI: 10.3390/s22218085]
Abstract
Anomaly detection is one of the biggest security issues in the Industrial Internet of Things (IIoT) due to the increasing danger of cyber attacks on distributed devices and critical infrastructure networks. To face these challenges, the Intrusion Detection System (IDS) is suggested as a robust mechanism to protect and monitor malicious activities in IIoT networks. In this work, we suggest a new mechanism to improve the efficiency and robustness of the IDS using Distributional Reinforcement Learning (DRL) and the Generative Adversarial Network (GAN). We aim to develop a realistic and balanced distribution for a given feature set using artificial data, in order to overcome the issue of data imbalance. We show how the GAN can efficiently assist the distributional-RL-based IDS in enhancing the detection of minority attacks. To assess our approach, we verified the effectiveness of our algorithm using the Distributed Smart Space Orchestration System (DS2OS) dataset. The performance of the normal DRL and DRL-GAN models in binary and multiclass classification was evaluated on anomaly detection datasets. The proposed models outperformed the normal DRL in the standard metrics of accuracy, precision, recall, and F1 score. We demonstrated that introducing the GAN into the training process of DRL, with the aim of improving the detection of a specific class of data, achieves the best results.
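A minimal sketch of the GAN side of this pipeline, generating synthetic minority-attack feature vectors for the IDS to train on (the DRL component and the paper's exact network sizes are omitted; all dimensions below are illustrative):

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, z_dim: int, feat_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, 64), nn.ReLU(),
            nn.Linear(64, feat_dim), nn.Sigmoid())  # features scaled to [0, 1]

    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    def __init__(self, feat_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.ReLU(),
            nn.Linear(64, 1))  # real-vs-fake logit

    def forward(self, x):
        return self.net(x)

def train_step(gen, disc, real_minority, g_opt, d_opt, z_dim=16):
    """One GAN step on a batch of real minority-attack feature vectors."""
    bce = nn.BCEWithLogitsLoss()
    ones = torch.ones(real_minority.size(0), 1)
    zeros = torch.zeros(real_minority.size(0), 1)
    fake = gen(torch.randn(real_minority.size(0), z_dim))
    # Discriminator: separate real minority samples from generated ones.
    d_loss = bce(disc(real_minority), ones) + bce(disc(fake.detach()), zeros)
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()
    # Generator: make synthetic samples that the discriminator accepts.
    g_loss = bce(disc(fake), ones)
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()
```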
Affiliation(s)
- Hafsa Benaddi
- Laboratory of Research in Informatics (LaRI), Faculty of Sciences, Ibn Tofail University, Kenitra 14000, Morocco
- Mohammed Jouhari
- School of Computer Science, Mohammed VI Polytechnic University, Ben Guerir 43150, Morocco
- Khalil Ibrahimi
- Laboratory of Research in Informatics (LaRI), Faculty of Sciences, Ibn Tofail University, Kenitra 14000, Morocco
- Jalel Ben Othman
- L2S Laboratory, Paris-Saclay University, CNRS, CentraleSupélec, 91190 Gif-sur-Yvette, France
- El Mehdi Amhoud
- School of Computer Science, Mohammed VI Polytechnic University, Ben Guerir 43150, Morocco
8
Anastasiou T, Karagiorgou S, Petrou P, Papamartzivanos D, Giannetsos T, Tsirigotaki G, Keizer J. Towards Robustifying Image Classifiers against the Perils of Adversarial Attacks on Artificial Intelligence Systems. Sensors (Basel) 2022; 22:6905. [PMID: 36146258] [PMCID: PMC9506202] [DOI: 10.3390/s22186905]
Abstract
Adversarial machine learning (AML) is a class of data manipulation techniques that cause alterations in the behavior of artificial intelligence (AI) systems while going unnoticed by humans. These alterations can cause serious vulnerabilities in mission-critical AI-enabled applications. This work introduces an AI architecture augmented with adversarial examples and defense algorithms to safeguard and secure AI systems and make them more reliable. This is achieved by robustifying deep neural network (DNN) classifiers, focusing explicitly on the specific case of convolutional neural networks (CNNs) used in non-trivial manufacturing environments prone to noise, vibrations, and errors when capturing and transferring data. The proposed architecture enables the imitation of the interplay between an attacker and a defender based on the deployment and cross-evaluation of adversarial and defense strategies. The AI architecture enables (i) the creation and use of adversarial examples in the training process, which robustifies the accuracy of CNNs, (ii) the evaluation of defense algorithms to recover the classifiers' accuracy, and (iii) the provision of a multiclass discriminator to distinguish and report on non-attacked and attacked data. The experiments show promising results for a hybrid solution combining the defense algorithms and the multiclass discriminator in an effort to revitalize the attacked base models and robustify the DNN classifiers. The proposed architecture is validated in the context of a real manufacturing environment, using datasets stemming from the actual production lines.
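A minimal sketch of injecting adversarial examples into training, assuming FGSM as the example generator (one standard choice; the abstract does not name the attacks or defense algorithms the architecture deploys):

```python
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, eps=8 / 255):
    """Perturb inputs (assumed scaled to [0, 1]) along the sign of the loss
    gradient to create adversarial training examples."""
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

def adversarial_training_step(model, optimizer, x, y, eps=8 / 255):
    """Train on an even mix of clean and adversarial batches, the standard
    way adversarial examples are injected to robustify a CNN classifier."""
    x_adv = fgsm_example(model, x, y, eps)
    optimizer.zero_grad()
    loss = 0.5 * (F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y))
    loss.backward()
    optimizer.step()
    return loss.item()
```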
Affiliation(s)
- Petros Petrou
- UBITECH Ltd., Thessalias 8 and Etolias 10, GR-15231 Chalandri, Greece
- Georgia Tsirigotaki
- Hellenic Army Information Technology Support Center, 227-231, Mesogeion Ave., GR-15451 Holargos, Greece
- Jelle Keizer
- Philips, Oliemolenstraat 5, 9203 ZN Drachten, The Netherlands
9
Baia AE, Biondi G, Franzoni V, Milani A, Poggioni V. Lie to Me: Shield Your Emotions from Prying Software. Sensors (Basel) 2022; 22:967. [PMID: 35161713] [PMCID: PMC8840139] [DOI: 10.3390/s22030967]
Abstract
Deep learning approaches for facial Emotion Recognition (ER) obtain high accuracy on basic models, e.g., Ekman's models, in the specific domain of facial emotional expressions. Facial tracking of users' emotions could thus easily be used against the right to privacy or for manipulative purposes. As recent studies have shown that deep learning models are susceptible to adversarial examples (images intentionally modified to fool a machine learning classifier), we propose to use them to preserve users' privacy against ER. In this paper, we present a technique for generating Emotion Adversarial Attacks (EAAs). EAAs are performed by applying well-known image filters inspired by Instagram, and a multi-objective evolutionary algorithm is used to determine the best per-image combination of attacking filters. Experimental results on the well-known AffectNet dataset of facial expressions show that our approach successfully attacks emotion classifiers to protect user privacy, while the quality of the images from the human-perception point of view is maintained. Several experiments with different sequences of filters show that the Attack Success Rate is very high, above 90% for every test.
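A minimal sketch of the filter-sequence attack surface, with a hypothetical filter set (the paper's actual Instagram-inspired filters and its multi-objective evolutionary search for per-image combinations are not reproduced here):

```python
from PIL import Image, ImageEnhance, ImageFilter

# Hypothetical Instagram-style filter primitives.
FILTERS = {
    "blur":     lambda im: im.filter(ImageFilter.GaussianBlur(1.5)),
    "contrast": lambda im: ImageEnhance.Contrast(im).enhance(1.4),
    "warmth":   lambda im: ImageEnhance.Color(im).enhance(1.3),
}

def apply_sequence(image: Image.Image, names: list) -> Image.Image:
    """Apply one candidate filter combination; an evolutionary search would
    score each candidate on both misclassification and perceptual quality."""
    for name in names:
        image = FILTERS[name](image)
    return image
```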
Affiliation(s)
- Alina Elena Baia
- Department of Mathematics and Computer Science, University of Florence, Viale Morgagni 67/a, 50134 Florence, Italy
- Giulio Biondi
- Department of Mathematics and Computer Science, University of Perugia, Via Vanvitelli 1, 06123 Perugia, Italy
- Valentina Franzoni
- Department of Mathematics and Computer Science, University of Perugia, Via Vanvitelli 1, 06123 Perugia, Italy
- Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong, China
- Alfredo Milani
- Department of Mathematics and Computer Science, University of Perugia, Via Vanvitelli 1, 06123 Perugia, Italy
- Valentina Poggioni
- Department of Mathematics and Computer Science, University of Perugia, Via Vanvitelli 1, 06123 Perugia, Italy
10
Mahmood K, Gurevin D, van Dijk M, Nguyen PH. Beware the Black-Box: On the Robustness of Recent Defenses to Adversarial Examples. Entropy (Basel) 2021; 23:1359. [PMID: 34682083] [PMCID: PMC8534430] [DOI: 10.3390/e23101359]
Abstract
Many defenses have recently been proposed at venues like NIPS, ICML, ICLR, and CVPR. These defenses are mainly focused on mitigating white-box attacks and do not properly examine black-box attacks. In this paper, we expand the analyses of these defenses to include adaptive black-box adversaries. Our evaluation covers nine defenses: Barrage of Random Transforms, ComDefend, Ensemble Diversity, Feature Distillation, The Odds are Odd, Error Correcting Codes, Distribution Classifier Defense, K-Winner Take All, and Buffer Zones. Our investigation uses two black-box adversarial models and six widely studied adversarial attacks on the CIFAR-10 and Fashion-MNIST datasets. Our analyses show that most recent defenses (7 out of 9) provide only marginal improvements in security (<25%) compared to undefended networks. For every defense, we also show the relationship between the amount of data the adversary has at their disposal and the effectiveness of adaptive black-box attacks. Overall, our results paint a clear picture: defenses need both thorough white-box and black-box analyses to be considered secure. We provide this large-scale study and these analyses to motivate the field to move towards the development of more robust black-box defenses.
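For orientation, a minimal sketch of the adaptive black-box threat model used in such evaluations: the adversary labels its own data by querying the victim, fits a surrogate, and transfers attacks crafted on the surrogate (model choice and query budget below are illustrative):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def fit_surrogate(victim_label, X_query: np.ndarray):
    """Label attacker-held data by querying the victim, then fit a local
    surrogate; adversarial examples crafted against the surrogate are
    transferred back to the victim."""
    y = np.array([victim_label(x) for x in X_query])
    surrogate = MLPClassifier(hidden_layer_sizes=(128,), max_iter=300)
    return surrogate.fit(X_query, y)
```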
Affiliation(s)
- Kaleel Mahmood
- Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269, USA
- Deniz Gurevin
- Department of Electrical and Computer Engineering, University of Connecticut, Storrs, CT 06269, USA
11
Menéndez HD, Clark D, Barr ET. Getting Ahead of the Arms Race: Hothousing the Coevolution of VirusTotal with a Packer. Entropy (Basel) 2021; 23:395. [PMID: 33810471] [DOI: 10.3390/e23040395]
Abstract
Malware detection is in a coevolutionary arms race where the attackers and defenders are constantly seeking advantage. This arms race is asymmetric: detection is harder and more expensive than evasion, and white hats must be conservative to avoid false positives when searching for malicious behaviour. Most of the time, black hats need only make incremental changes to evade detection. On occasion, white hats make a disruptive move and find a new technique that forces black hats to work harder; examples include system calls, signatures, and machine learning. We seek to redress this imbalance. We present a method, called Hothouse, that combines simulation and search to accelerate the white hat's ability to counter the black hat's incremental moves, thereby forcing black hats to perform disruptive moves more often. To realise Hothouse, we evolve EEE, an entropy-based polymorphic packer for Windows executables. Playing the role of a black hat, EEE uses evolutionary computation to disrupt the creation of malware signatures. We enter EEE into the detection arms race with VirusTotal, the most prominent cloud service for running anti-virus tools on software. During our 6-month study, we continually improved EEE in response to VirusTotal, eventually learning a packer that produces packed malware whose median detection rate drops from an initial 51.8% to 19.6%. We report both how well VirusTotal learns to detect EEE-packed binaries and how well VirusTotal forgets in order to reduce false positives. VirusTotal's tools learn and forget fast, in about three days. We also show where VirusTotal focuses its detection efforts, by analysing EEE's variants.
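Byte-level Shannon entropy is the signal an entropy-based packer like EEE manipulates; a minimal sketch of the measurement (EEE's evolutionary search itself is not reproduced here):

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Byte-level Shannon entropy in bits per byte: packed or encrypted
    sections sit near 8, which is the heuristic a polymorphic packer
    must shape its output around."""
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())
```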
12
Sadeghi K, Banerjee A, Gupta SKS. A System-Driven Taxonomy of Attacks and Defenses in Adversarial Machine Learning. IEEE Trans Emerg Top Comput Intell 2020; 4:450-467.
Abstract
Machine Learning (ML) algorithms, specifically supervised learning, are widely used in modern real-world applications that utilize Computational Intelligence (CI) as their core technology, such as autonomous vehicles, assistive robots, and biometric systems. Attacks that cause misclassifications or mispredictions can lead to erroneous decisions resulting in unreliable operations. Designing robust ML with the ability to provide reliable results in the presence of such attacks has become a top priority in the field of adversarial machine learning. An essential characteristic for the rapid development of robust ML is an arms race between attack and defense strategists. However, an important prerequisite for this arms race is access to a well-defined system model so that experiments can be repeated by independent researchers. This paper proposes a fine-grained, system-driven taxonomy to specify ML applications and adversarial system models in an unambiguous manner, such that independent researchers can replicate experiments and escalate the arms race to develop more evolved and robust ML applications. The paper provides taxonomies for: 1) the dataset; 2) the ML architecture; 3) the adversary's knowledge, capability, and goal; 4) the adversary's strategy; and 5) the defense response. In addition, the relationships among these models and taxonomies are analyzed by proposing an adversarial machine learning cycle. The provided models and taxonomies are merged to form a comprehensive system-driven taxonomy which represents the arms race between ML applications and adversaries in recent years. The taxonomies encode best practices in the field, help evaluate and compare the contributions of research works, and reveal gaps in the field.
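One way to make such a system model machine-readable is to encode the five taxonomy axes as a typed record; the sketch below is illustrative and does not use the paper's controlled vocabulary:

```python
from dataclasses import dataclass

@dataclass
class AdversaryModel:
    knowledge: str   # e.g. "white-box" or "black-box"
    capability: str  # e.g. "perturb test-time inputs"
    goal: str        # e.g. "targeted misclassification"

@dataclass
class ExperimentSpec:
    """An unambiguous, replicable record of one arms-race experiment,
    mirroring the five taxonomy axes."""
    dataset: str
    ml_architecture: str
    adversary: AdversaryModel
    attack_strategy: str
    defense_response: str
```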
Affiliation(s)
- Koosha Sadeghi
- IMPACT Lab (http://impact.asu.edu/), CIDSE, Arizona State University, Tempe, AZ 85281, USA
- Ayan Banerjee
- IMPACT Lab (http://impact.asu.edu/), CIDSE, Arizona State University, Tempe, AZ 85281, USA
- Sandeep K S Gupta
- IMPACT Lab (http://impact.asu.edu/), CIDSE, Arizona State University, Tempe, AZ 85281, USA
13
Maestre Vidal J, Sotelo Monge MA. Obfuscation of Malicious Behaviors for Thwarting Masquerade Detection Systems Based on Locality Features. Sensors (Basel) 2020; 20:2084. [PMID: 32272806] [PMCID: PMC7181010] [DOI: 10.3390/s20072084]
Abstract
In recent years, dynamic user verification has become one of the basic pillars of insider threat detection. Among these threats, the research presented in this paper focuses on masquerader attacks, a category of insider threat in which persons outside the organization somehow manage to impersonate legitimate users. Consequently, it is assumed that masqueraders are unaware of the protected environment within the targeted organization, so they are expected to move more erratically than legitimate users through the compromised systems. This makes them susceptible to discovery by dynamic user verification methods based on user profiling and anomaly-based intrusion detection. However, these approaches can be evaded through the imitation of the normal, legitimate usage of the protected system (mimicry), which is being widely exploited by intruders. In order to contribute to their understanding, as well as to anticipate their evolution, the conducted research studies mimicry from the standpoint of an uncharted terrain: masquerade detection based on analyzing locality traits. With this purpose, the problem is stated in detail, and a pair of novel obfuscation methods are introduced: locality-based mimicry by action pruning and locality-based mimicry by noise generation. Their modus operandi, effectiveness, and impact are evaluated against a collection of well-known classifiers typically implemented for masquerade detection. The simplicity and effectiveness demonstrated suggest that they constitute attack vectors that should be taken into consideration for the proper hardening of real organizations.
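A minimal sketch of the two obfuscation ideas, treating a session as a list of discrete actions (the paper's locality model is richer; `legitimate_vocab` and the noise rate are placeholders):

```python
import random

def prune_actions(trace: list, legitimate_vocab: set) -> list:
    """Mimicry by action pruning: drop actions outside the locality profile
    a legitimate user would exhibit, so the trace looks less erratic."""
    return [a for a in trace if a in legitimate_vocab]

def inject_noise(trace: list, legitimate_vocab: set,
                 rate: float = 0.3, seed: int = 0) -> list:
    """Mimicry by noise generation: interleave plausible legitimate actions
    to dilute the attacker's erratic locality footprint."""
    rng = random.Random(seed)
    out = []
    for a in trace:
        out.append(a)
        if rng.random() < rate:
            out.append(rng.choice(sorted(legitimate_vocab)))
    return out
```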
Affiliation(s)
- Jorge Maestre Vidal
- Indra, Digital Labs, Av. de Bruselas, 35, Alcobendas, 28108 Madrid, Spain
- Marco Antonio Sotelo Monge
- Faculty of Engineering and Architecture, Universidad de Lima, Avenida Javier Prado Este, Lima 4600, Peru