1
Scott BA, Johnstone MN, Szewczyk P. A Survey of Advanced Border Gateway Protocol Attack Detection Techniques. Sensors (Basel) 2024; 24:6414. [PMID: 39409453] [PMCID: PMC11479385] [DOI: 10.3390/s24196414]
Abstract
The Internet's default inter-domain routing system, the Border Gateway Protocol (BGP), remains insecure. Detection techniques are dominated by approaches that involve large numbers of features, parameters, domain-specific tuning, and training, often contributing to an unacceptable computational cost. Efforts to detect anomalous activity in the BGP have been almost exclusively focused on single observable monitoring points and Autonomous Systems (ASs). BGP attacks can exploit and evade these limitations. In this paper, we review and evaluate categories of BGP attacks based on their complexity. Previously identified next-generation BGP detection techniques remain incapable of detecting advanced attacks that exploit single observable detection approaches and those designed to evade public routing monitor infrastructures. Advanced BGP attack detection requires lightweight, rapid capabilities with the capacity to quantify group-level multi-viewpoint interactions, dynamics, and information. We term this approach advanced BGP anomaly detection. This survey evaluates 178 anomaly detection techniques and identifies which are candidates for advanced attack anomaly detection. Preliminary findings from an exploratory investigation of advanced BGP attack candidates are also reported.
Affiliation(s)
- Ben A. Scott
- School of Science, Edith Cowan University, Perth, WA 6027, Australia (P.S.)
- School of Science, Engineering & Technology, RMIT University, Ho Chi Minh City 700000, Vietnam
- Patryk Szewczyk
- School of Science, Edith Cowan University, Perth, WA 6027, Australia (P.S.)
2
Yang X, Zhuang Y, Shi M, Cao X, Chen D, Tang Y. SPiForest: An Anomaly Detecting Algorithm Using Space Partition Constructed by Probability Density-Based Inverse Sampling. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:8013-8025. [PMID: 36449578] [DOI: 10.1109/tnnls.2022.3223342]
Abstract
SPiForest is a new isolation-based approach to outlier detection that constructs iTrees on the space containing all attributes using probability density-based inverse sampling. Most existing iForest (iF)-based approaches can precisely and quickly detect outliers scattered around one or more normal clusters. However, their performance degrades severely for outliers whose "few and different" nature disappears in subspace (e.g., anomalies surrounded by normal samples). SPiForest addresses this problem differently from existing approaches. First, it applies principal component analysis (PCA) to find the principal components and estimates each component's probability density function (pdf). Second, it uses the inv-pdf, which is inversely proportional to the pdf estimated from the given dataset, to generate support points in the space containing all attributes. Third, the hyperplane determined by these support points is used to isolate outliers in that space. These steps are repeated to build an iTree, and many iTrees are combined into a forest for outlier detection. SPiForest provides two benefits: 1) it isolates outliers with fewer hyperplanes, which significantly improves accuracy, and 2) it effectively detects outliers whose "few and different" nature disappears in subspace. Comparative analyses and experiments show that SPiForest achieves a significant improvement in area under the curve (AUC) over state-of-the-art methods; specifically, it improves AUC by up to 17.7% compared with iF-based algorithms.
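The split construction described above can be illustrated with a short, hedged Python sketch. Assumptions not taken from the paper: scipy's gaussian_kde as the density estimator and weighted sampling over uniform candidates for the inverse-pdf step; the PCA preprocessing and the full iTree/forest construction are omitted.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

def inverse_density_support_points(X, n_points, n_candidates=2000):
    """Sample support points with probability inversely proportional to the
    estimated density of X (the 'inv-pdf' idea), using weighted choice over
    uniform candidates drawn from the bounding box of X."""
    kde = gaussian_kde(X.T)                      # density estimate over all attributes
    lo, hi = X.min(axis=0), X.max(axis=0)
    cand = rng.uniform(lo, hi, size=(n_candidates, X.shape[1]))
    inv = 1.0 / (kde(cand.T) + 1e-12)            # inverse-pdf weights
    idx = rng.choice(n_candidates, size=n_points, replace=False, p=inv / inv.sum())
    return cand[idx]

def hyperplane_split(X):
    """One isolation split: d support points in R^d define a hyperplane,
    and points are partitioned by the side of the hyperplane they fall on."""
    d = X.shape[1]
    P = inverse_density_support_points(X, d)     # d points define a hyperplane in R^d
    A = P[1:] - P[0]                             # in-plane directions, shape (d-1, d)
    _, _, vt = np.linalg.svd(A)
    normal = vt[-1]                              # direction orthogonal to all rows of A
    side = (X - P[0]) @ normal > 0
    return X[side], X[~side]

X = rng.normal(size=(500, 3))
left, right = hyperplane_split(X)
print(len(left), len(right))
```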
3
Yu Q, Li C, Zhu Y, Kurita T. Convolutional autoencoder based on latent subspace projection for anomaly detection. Methods 2023; 214:48-59. [PMID: 37120080] [DOI: 10.1016/j.ymeth.2023.04.007]
Abstract
Image anomaly detection (AD) is widely studied in computer vision. High-dimensional data such as images, with noise and complex backgrounds, remain challenging for anomaly detection when only imbalanced or incomplete data are available. Some deep learning methods can be trained in an unsupervised way, mapping the original input onto low-dimensional manifolds so that anomalies show larger deviations from normal samples after dimension reduction. However, training a single low-dimensional latent space limits the quality of the learned features, because noise and irrelevant features are mapped into the same space and the resulting manifolds are not discriminative for detecting anomalies. To address this problem, this study proposes a new autoencoder framework, named LSP-CAE, with two trainable, mutually orthogonal, complementary subspaces in the latent space obtained through a latent subspace projection (LSP) mechanism. Specifically, latent subspace projection is used to train the latent image subspace (LIS) and the latent kernel subspace (LKS) in the latent space of the autoencoder-like model, which strengthens the model's ability to learn different features from the input instance. The features of normal data are projected into the latent image subspace, while the latent kernel subspace is trained end to end to absorb information irrelevant to the normal features. To verify the generality and effectiveness of the proposed method, the convolutional network is replaced with a fully-connected network on real-world medical datasets. An anomaly score based on the projection norms in the two subspaces is used to evaluate anomalies at test time. The proposed method achieves the best performance on four public datasets in comparison with state-of-the-art methods.
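A minimal numpy sketch of the projection-norm idea behind the anomaly score, not the trained LSP-CAE itself: the subspace bases here are random stand-ins for the learned latent image and kernel subspaces, and the ratio used to combine the two norms is an illustrative choice rather than the paper's score.

```python
import numpy as np

rng = np.random.default_rng(1)
k, k_img = 16, 12                            # latent dimension and image-subspace dimension

# Two mutually orthogonal complementary subspaces of the latent space
# (chosen at random here; in LSP-CAE they are learned end to end).
Q, _ = np.linalg.qr(rng.normal(size=(k, k)))
U_img, U_ker = Q[:, :k_img], Q[:, k_img:]    # columns form orthonormal bases

def anomaly_score(z):
    """Score a latent code by how much of its norm falls outside the image
    subspace: large kernel-subspace energy suggests an anomaly."""
    z_img = U_img @ (U_img.T @ z)            # projection onto the latent image subspace
    z_ker = U_ker @ (U_ker.T @ z)            # projection onto the latent kernel subspace
    return np.linalg.norm(z_ker) / (np.linalg.norm(z_img) + 1e-12)

z_normal = U_img @ rng.normal(size=k_img)    # lies entirely in the image subspace
z_anom = rng.normal(size=k)                  # spreads over both subspaces
print(anomaly_score(z_normal), anomaly_score(z_anom))
```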
Affiliation(s)
- Qien Yu
- School of Information Science and Engineering, Chongqing Jiaotong University, Chongqing, 400074, China.
- Chen Li
- Graduate School of Engineering, Nagoya University, Nagoya, 464-8603, Japan.
- Ye Zhu
- School of Information Technology, Deakin University, Victoria 3125, Australia.
- Takio Kurita
- Graduate School of Advanced Science and Engineering, Hiroshima University, Higashi-hiroshima, Hiroshima, 739-8521, Japan.
4
Intelligent Fault Diagnosis of Industrial Robot Based on Multiclass Mahalanobis-Taguchi System for Imbalanced Data. Entropy 2022; 24:e24070871. [PMID: 35885094] [PMCID: PMC9317314] [DOI: 10.3390/e24070871]
Abstract
One of the biggest challenges in fault diagnosis research for industrial robots is that normal data far outnumber fault data; that is, the data are imbalanced. Traditional diagnosis approaches for industrial robots are biased toward the majority categories, which reduces diagnostic accuracy for the minority categories. To address the imbalance, traditional algorithms have been improved with cost-sensitive learning, single-class learning and other approaches, but these algorithms have their own problems: the true misclassification cost is difficult to estimate, and they are prone to overfitting and long computation times. Therefore, this article proposes a fault diagnosis approach for industrial robots based on the Multiclass Mahalanobis-Taguchi System (MMTS), which classifies samples by measuring their degree of deviation from a reference space and is therefore better suited to imbalanced data. Accuracy, G-mean and F-measure are used to verify the effectiveness of the proposed approach on an industrial robot platform. The experimental results show that the proposed approach improves accuracy, F-measure and G-mean by an average of 20.74%, 12.85% and 21.68%, respectively, compared with five traditional approaches when the imbalance ratio is 9. As the imbalance ratio increases, the proposed approach remains more stable than the traditional algorithms.
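A hedged numpy sketch of the core measurement in a Mahalanobis-Taguchi-style classifier: each class defines a reference space from its own samples (mean and covariance), and a query is assigned to the class whose reference space it deviates from least. The feature screening and threshold selection of the full MMTS are omitted, and the toy data are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

def reference_space(X):
    """Mean and (regularized) inverse covariance of one class's samples."""
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
    return mu, np.linalg.inv(cov)

def mahalanobis_distance(x, mu, cov_inv):
    d = x - mu
    return float(np.sqrt(d @ cov_inv @ d))

# Imbalanced toy data: many "normal" samples, few samples per fault class.
classes = {
    "normal": rng.normal(0.0, 1.0, size=(900, 4)),
    "fault_A": rng.normal(3.0, 1.0, size=(50, 4)),
    "fault_B": rng.normal(-3.0, 1.0, size=(50, 4)),
}
spaces = {name: reference_space(X) for name, X in classes.items()}

def classify(x):
    """Assign x to the class whose reference space it deviates from least."""
    dists = {name: mahalanobis_distance(x, mu, ci) for name, (mu, ci) in spaces.items()}
    return min(dists, key=dists.get), dists

label, dists = classify(rng.normal(3.0, 1.0, size=4))
print(label, {k: round(v, 2) for k, v in dists.items()})
```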
5
Majority-to-minority resampling for boosting-based classification under imbalanced data. Appl Intell 2022. [DOI: 10.1007/s10489-022-03585-2]
6
Yu Q, Kavitha M, Kurita T. Autoencoder framework based on orthogonal projection constraints improves anomalies detection. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.04.033]
7
8
Arashloo SR, Kittler J. Robust One-Class Kernel Spectral Regression. IEEE Transactions on Neural Networks and Learning Systems 2021; 32:999-1013. [PMID: 32481229] [DOI: 10.1109/tnnls.2020.2979823]
Abstract
The kernel null-space technique is known to be an effective one-class classification (OCC) technique. Nevertheless, the applicability of this method is limited due to its susceptibility to possible training data corruption and the inability to rank training observations according to their conformity with the model. This article addresses these shortcomings by regularizing the solution of the null-space kernel Fisher methodology in the context of its regression-based formulation. In this respect, first, the effect of the Tikhonov regularization in the Hilbert space is analyzed, where the one-class learning problem in the presence of contamination in the training set is posed as a sensitivity analysis problem. Next, the effect of the sparsity of the solution is studied. For both alternative regularization schemes, iterative algorithms are proposed which recursively update label confidences. Through extensive experiments, the proposed methodology is found to enhance robustness against contamination in the training set compared with the baseline kernel null-space method, as well as other existing approaches in the OCC paradigm, while providing the functionality to rank training samples effectively.
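A compact numpy sketch of the regression-based view of one-class kernel learning that the Tikhonov regularization acts on: a ridge-regularized kernel regressor maps every training sample to a common target, and test points are scored by how far their prediction falls from that target. The paper's iterative label-confidence updates and sparsity-based variant are not shown, and the RBF kernel and toy data are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def rbf_kernel(A, B, gamma=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# Train on the target (normal) class only.
X_train = rng.normal(0.0, 1.0, size=(200, 2))
y = np.ones(len(X_train))                        # common regression target for the class

lam = 1e-2                                       # Tikhonov regularization strength
K = rbf_kernel(X_train, X_train)
alpha = np.linalg.solve(K + lam * np.eye(len(K)), y)

def novelty_score(X_test):
    """|target - prediction|: small for samples consistent with the training class."""
    return np.abs(1.0 - rbf_kernel(X_test, X_train) @ alpha)

inliers = rng.normal(0.0, 1.0, size=(5, 2))
outliers = rng.normal(5.0, 1.0, size=(5, 2))
print(novelty_score(inliers).round(3), novelty_score(outliers).round(3))
```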
9
Sabokrou M, Fathy M, Zhao G, Adeli E. Deep End-to-End One-Class Classifier. IEEE Transactions on Neural Networks and Learning Systems 2021; 32:675-684. [PMID: 32275608] [DOI: 10.1109/tnnls.2020.2979049]
Abstract
One-class classification (OCC) is an essential component of many machine learning and computer vision applications, including novelty, anomaly, and outlier detection systems. Given a known definition of a target or normal set of data, one-class classifiers determine whether a new sample falls within the distribution of the target class. Solving this task in a general setting is particularly challenging because of the high diversity of samples from the target class and the absence of any supervisory signal for the novelty (nontarget) concept, which makes designing end-to-end models difficult. In this article, we propose an adversarial training approach for detecting out-of-distribution samples with an end-to-end trainable deep model. To this end, we jointly train two deep neural networks, R and D. The latter acts as the discriminator, while the former, during training, helps D characterize a probability distribution for the target class by creating adversarial examples and, during testing, collaborates with D to detect novelties. Using our OCC, we first test outlier detection on two image datasets, Modified National Institute of Standards and Technology (MNIST) and Caltech-256. Then, several video anomaly detection experiments are performed on the University of Minnesota (UMN) and University of California, San Diego (UCSD) datasets. The proposed method successfully learns the underlying distribution of the target class and outperforms other approaches.
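A heavily simplified PyTorch sketch of the R-plus-D training idea on stand-in data: random tensors take the place of image datasets, shallow fully connected networks take the place of the paper's architectures, and the loss weights are arbitrary choices.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
dim = 64                                         # stand-in "image" dimension

R = nn.Sequential(nn.Linear(dim, 16), nn.ReLU(), nn.Linear(16, dim))   # reconstructor
D = nn.Sequential(nn.Linear(dim, 32), nn.ReLU(), nn.Linear(32, 1))     # discriminator

opt_R = torch.optim.Adam(R.parameters(), lr=1e-3)
opt_D = torch.optim.Adam(D.parameters(), lr=1e-3)
bce, mse = nn.BCEWithLogitsLoss(), nn.MSELoss()

target = torch.randn(512, dim)                   # stand-in target-class samples

for step in range(200):
    x = target[torch.randint(0, len(target), (64,))]
    x_rec = R(x + 0.1 * torch.randn_like(x))     # reconstruct a noisy version of x

    # D: real target samples -> 1, reconstructed samples -> 0.
    d_loss = bce(D(x), torch.ones(64, 1)) + bce(D(x_rec.detach()), torch.zeros(64, 1))
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()

    # R: reconstruct well and make reconstructions look "real" to D.
    r_loss = mse(x_rec, x) + 0.2 * bce(D(x_rec), torch.ones(64, 1))
    opt_R.zero_grad(); r_loss.backward(); opt_R.step()

def novelty_score(x):
    """At test time R and D collaborate: feed the reconstruction to D and use
    its confidence; a low D output suggests the sample is outside the target class."""
    with torch.no_grad():
        return -D(R(x)).squeeze(1)               # higher = more novel

print(novelty_score(torch.randn(3, dim)))
```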
10
Wu P, Liu J, Shen F. A Deep One-Class Neural Network for Anomalous Event Detection in Complex Scenes. IEEE Transactions on Neural Networks and Learning Systems 2020; 31:2609-2622. [PMID: 31494560] [DOI: 10.1109/tnnls.2019.2933554]
Abstract
How can a generic deep one-class model be built to solve one-class classification problems for anomaly detection, such as anomalous event detection in complex scenes? The characteristics of one-class labels lead to a dilemma: a multi-class classifier based on deep neural networks cannot be used directly to solve one-class classification problems. Therefore, this article proposes a novel deep one-class neural network, termed DeepOC, which simultaneously learns compact feature representations and trains a one-class classifier. Given only normal samples, a stacked convolutional encoder generates their low-dimensional high-level features, and a one-class classifier is trained to make these features as compact as possible. Meanwhile, to preserve a correct mapping relation and the diversity of the feature representations, a decoder reconstructs the raw samples from these low-dimensional features. This structure is gradually established using an adversarial mechanism during the training stage. This mechanism is the key to the model: it organically combines two seemingly contradictory components and allows them to take advantage of each other, making the model robust and effective. Unlike methods that use handcrafted features or that are separated into two stages (extracting features and then training classifiers), DeepOC is a one-stage model that relies on features automatically extracted by neural networks. Experiments on various benchmark datasets show that DeepOC is feasible and achieves state-of-the-art anomaly detection results compared with a dozen existing methods.
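A hedged PyTorch sketch of the encoder-classifier-decoder interplay described above, on stand-in data: a fixed feature-space center stands in for the trained one-class classifier, and the adversarial mechanism that couples the components is omitted, so this is only the compactness-plus-reconstruction skeleton.

```python
import torch
import torch.nn as nn

torch.manual_seed(1)
dim, feat = 64, 8                                # input size and low-dimensional feature size

enc = nn.Sequential(nn.Linear(dim, 32), nn.ReLU(), nn.Linear(32, feat))
dec = nn.Sequential(nn.Linear(feat, 32), nn.ReLU(), nn.Linear(32, dim))
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)

normal = torch.randn(512, dim)                   # stand-in normal samples
center = enc(normal).mean(dim=0).detach()        # fixed center for the compactness term

for step in range(200):
    x = normal[torch.randint(0, len(normal), (64,))]
    z = enc(x)
    compact = ((z - center) ** 2).sum(dim=1).mean()   # pull features toward the center
    recon = ((dec(z) - x) ** 2).mean()                # keep the mapping information-preserving
    loss = compact + recon
    opt.zero_grad(); loss.backward(); opt.step()

def anomaly_score(x):
    """Distance to the center plus reconstruction error: large for anomalies."""
    with torch.no_grad():
        z = enc(x)
        return ((z - center) ** 2).sum(dim=1) + ((dec(z) - x) ** 2).mean(dim=1)

print(anomaly_score(torch.randn(3, dim)))
```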
11
12
Livi L, Alippi C. One-Class Classifiers Based on Entropic Spanning Graphs. IEEE Transactions on Neural Networks and Learning Systems 2017; 28:2846-2858. [PMID: 28114079] [DOI: 10.1109/tnnls.2016.2608983]
Abstract
One-class classifiers offer valuable tools to assess the presence of outliers in data. In this paper, we propose a design methodology for one-class classifiers based on entropic spanning graphs. Our approach can also process nonnumeric data by means of an embedding procedure. The spanning graph is learned on the embedded input data, and the resulting partition of its vertices defines the classifier. The final partition is derived by a criterion based on mutual information minimization; here, the mutual information is computed using a convenient formulation expressed in terms of the α-Jensen difference. Once training is completed, a graph-based fuzzy model is constructed to associate a confidence level with the classifier decision. The fuzzification process is based only on topological information about the vertices of the entropic spanning graph. As such, the proposed one-class classifier is also suitable for data characterized by complex geometric structures. We provide experiments on well-known benchmarks containing both feature vectors and labeled graphs. In addition, we apply the method to the protein solubility recognition problem, considering several representations of the input samples. Experimental results demonstrate the effectiveness and versatility of the proposed method with respect to other state-of-the-art approaches.
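A small scipy/numpy sketch of the entropic-spanning-graph ingredient only, not the classifier, its partitioning criterion, or the fuzzy model: a Euclidean minimum spanning tree over the samples and a Hero-Michel-style Rényi entropy estimate computed from its edge lengths, with the dimension-dependent bias constant dropped.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

rng = np.random.default_rng(4)

def mst_edge_lengths(X):
    """Edge lengths of the Euclidean minimum spanning tree over the samples."""
    mst = minimum_spanning_tree(squareform(pdist(X)))
    return mst.data                              # the n-1 MST edge weights

def renyi_entropy_estimate(X, alpha=0.5):
    """MST-based Renyi entropy estimate, up to an additive constant that
    depends only on alpha and the dimension."""
    n, d = X.shape
    gamma = d * (1.0 - alpha)
    L = np.sum(mst_edge_lengths(X) ** gamma)     # gamma-weighted total MST length
    return np.log(L / n ** alpha) / (1.0 - alpha)

tight = rng.normal(0.0, 0.3, size=(300, 2))      # concentrated cloud -> lower entropy
spread = rng.normal(0.0, 3.0, size=(300, 2))     # dispersed cloud -> higher entropy
print(renyi_entropy_estimate(tight), renyi_entropy_estimate(spread))
```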
13
Gu B, Sun X, Sheng VS. Structural Minimax Probability Machine. IEEE Transactions on Neural Networks and Learning Systems 2017; 28:1646-1656. [PMID: 27101618] [DOI: 10.1109/tnnls.2016.2544779]
Abstract
The minimax probability machine (MPM) is an interesting discriminative classifier based on generative prior knowledge. It can directly estimate a probabilistic accuracy bound by minimizing the maximum probability of misclassification. The structural information of data is an effective way to represent prior knowledge and has been found to be vital for designing classifiers in real-world problems. However, MPM considers only the prior probability distribution of each class with a given mean and covariance matrix, which does not efficiently exploit the structural information of the data. In this paper, we use two finite mixture models to capture the structural information of the two classes in binary classification. For each subdistribution in a finite mixture model, only its mean and covariance matrix are assumed to be known. Based on the finite mixture models, we propose a structural MPM (SMPM). SMPM can be solved effectively by a sequence of second-order cone programming problems. Moreover, we extend the linear SMPM to a nonlinear model via kernelization. We also show that SMPM can be interpreted as a large-margin classifier and can be transformed into the support vector machine and the maxi-min margin machine under certain special conditions. Experimental results on both synthetic and real-world datasets demonstrate the effectiveness of SMPM.
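A hedged cvxpy sketch of the plain (non-structural) MPM that SMPM extends: given each class's mean and covariance, minimize the sum of the two covariance-weighted norms subject to a normalization constraint, then read off the worst-case accuracy bound. The finite-mixture decomposition and kernelization of SMPM are not shown; cvxpy must be installed, and the toy data are invented.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(5)

# Two classes summarized only by mean and covariance (the MPM prior knowledge).
X1 = rng.multivariate_normal([2, 0], [[1.0, 0.3], [0.3, 1.0]], size=300)
X2 = rng.multivariate_normal([-2, 0], [[1.0, -0.2], [-0.2, 0.5]], size=300)
mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
S1, S2 = np.cov(X1, rowvar=False), np.cov(X2, rowvar=False)
S1_half, S2_half = np.linalg.cholesky(S1), np.linalg.cholesky(S2)

# MPM as a second-order cone program: minimize the two covariance-weighted
# norms of a subject to a'(mu1 - mu2) = 1.
a = cp.Variable(2)
objective = cp.Minimize(cp.norm(S1_half.T @ a) + cp.norm(S2_half.T @ a))
problem = cp.Problem(objective, [(mu1 - mu2) @ a == 1])
problem.solve()

a_val = a.value
kappa = 1.0 / problem.value                      # optimal kappa
accuracy_bound = kappa**2 / (1.0 + kappa**2)     # worst-case accuracy guarantee
b = a_val @ mu1 - kappa * np.sqrt(a_val @ S1 @ a_val)

print("worst-case accuracy bound:", round(accuracy_bound, 3))
print("predict class 1:", bool(a_val @ np.array([1.5, 0.0]) >= b))
```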