2.3 Machine Learning Approaches
2.3.2 Behaviour-based Approaches
2.3.2.1 Classification
In the case of the availability of both normal and anomalous data (i.e., data maturity is high), the system is capable to learn on both classes. Hence, the insider threat de- tection problem is addressed using a two-class classification approach. In what fol- lows, a review of the two-class classification approaches proposed for insider threat detection is provided.
Mayhew et al. [42] describe a Behaviour-Based Access Control (BBAC) approach that analyses user behaviour at the network layer, the HTTP request layer, and the document layer to detect behaviour changes. BBAC constructs multiple Support Vector Machine (SVM) classifiers, each analyses a certain type of user behaviour in real enterprise data such as, TCP connection behaviour, HTTP request behaviour, and text behaviour, to classify data as normal or malicious insider threat. BBAC combinesk-means clustering and C4.5 decision tree in its approach to improve scal- ability and reduce false positives. The paper reports a variety of results depending on the data type of the malicious behaviour detected, where the TP rate=76−99.6%
16 Chapter 2. Approaches for Insider Threat Detection
Punithavathani et al. [43] present an Insider Attack Detection System (IADS) based on the supervised k-Nearest Neighbours (k-NN) to counter insider threats in critical networks. IADS consists of two engines: a network traffic engine that monitors and captures incoming and outgoing packets in traffic, then extracts the important features and stores in a database; and a log analyser engine that mines user logs, extracts patterns of user behaviour, and appliesk-NN to identify normal and anomalous patterns based on the closest training patterns in history. In order to verify the detection of malicious threats, IADS employs the computational intel- ligence to cross check the anomalous patterns (detected by the log analyser engine) with the network features (extracted by the network traffic engine).
Azaria et al. [44] present a Behavioural Analysis of Insider Threat (BAIT) frame- work that contains bootstrapping algorithms built on top of SVM or Naives Bayes classifiers. The BAIT algorithms include: an SVM classifier with an update proce- dure using the topkmalicious users; an SVM classifier with an update procedure using the topkmalicious users and the topl·kbenign users; an SVM classifier with aniterativeupdate procedure using either a malicious or an honest user one at a time; and a Naive Bayes classifier with aniterativeupdate procedure using the recent prob- abilities. The algorithm with SVM classifier and iterative update procedure reports the best results with F-measure=0.4, and FP=14(FP rate=7.36%), however, the base- line SVM classifier reports F-measure=0.273, but much lower FP=9(FP rate=4.73%). Sen [45] suggests a semi-supervised approach, namely Instance Weighted Naive Bayes (IWNB), with the aim to reduce the FP rate. In this context, IWNB trains the model with labelled train instances, and then updates (retrains) the model with all unlabelled test instances regardless of the label assigned (predicted) for these test instances by Naive Bayes. The authors claim that the shortcoming of relying on the decision of Naive Bayes is that the updated model will not consider the FPs. To address this shortcoming, IWNB considers all unlabelled test instances and assigns weights (probabilities) to them based on their similarity to labelled instances, so that new instances are assigned lower weights than old instances. In this way, the model adapts to the normal change in user’s behaviour, which will consequently reduce the FP rate. Although this approach outperforms other existing approaches, including traditional Naive Bayes, as the number of appended FNs increases, it is
2.3. Machine Learning Approaches 17
not convincing to append FNs to the updated user model. The comparison of IWNB to other approach flags false alarms=4.5for IWNB, compared to false alarms=1.3in another approach.
Axelrad et al. [46] presented a Bayesian Network (BN) approach to predict in- sider threats based on psychological features of malicious insiders. The BN model is constructed, such that the nodes represent the categories of the psychological fea- tures, and the directed links between the nodes represent the pairwise correlation ratio between the categories (i.e. an estimation of how much a pair of categories are correlated or inversely correlated). The paper presents an experimental approach with a revised BN, where error rate=64.81%and logarithmic loss=1.344report lower values than original BN (error rate=66.46%and logarithmic loss=1.427). Although the authors claim a considerable improvement, the values of the error rate and the logarithmic loss for the revised BN are still too high.
Barrios [47] develops a multi-level framework, called database Intrusion Detec- tion System (dIDS), to detect malicious transactions. dIDS employs: at level ‘1’ Apriori Hybrid algorithm to check the match between the new transaction and the prior user’s transactions logged in the dIDS database; at level ‘2’ Stochastic Gradi- ent Boosting to find the probability of an anomaly; and at level ‘3’ BN to model the data giving a conditional probability of anomaly to a new transaction based on prior user’s transactions. A new transaction is assessed at each level in turn. If it is flagged as anomalous, it is appended to the dIDS database and the assessment is terminated, otherwise, it is assessed at the next level. The multi-level approach benefits from the hybrid utilisation of different algorithms, as well as allows to reduce the consuming time when a transaction is flagged anomalous at an earlier level. The probability of detection reported by the framework is =38.38%which is too low.
Kandias et al. [48] compare the performance of text classification approaches for YouTube user comments to detect malicious insiders (users) holding negative attitude towards law enforcement and authorities. The comments are defined into two categories: category (P), which hold negative attitude towards law enforcement and authorities; and category (N), which hold neutral attitude. The text classifica- tion aims to classify the comments into category (P) or (N) using: machine learning approach such as SVM, Multinomial Naive Bayes, or Logistic Regression (LR); a
18 Chapter 2. Approaches for Insider Threat Detection
dictionary-based approach with terms and phrases related to law enforcement; or a flat-data approach based on Naive Bayes, such that the comments are augmented with user data (public profile information). LR, followed by SVM, achieve a better performance compared to Naive Bayes in terms of precision, recall, F-measure, and accuracy. The dictionary-based approach reveals a significant deviation from LR which indicates the strength of LR in learning the correlations from the training set. However, the flat-data approach indicates the percentage of comments of category (P) a10%higher than that in LR.
Brdiczka et al. [49] present Graph Learning for Anomaly Detection using Psy- chological Context (GLAD-PC) to detect threats. GLAD-PC consists of two threads: structural anomaly detection thread which uses graph structure analysis (Random Forest) on network data to separate the normal from the anomalous; and psycho- logical profiling thread that constructs psychological profiles from behavioural data in order to track sudden changes in the psychological features. GLAD-PC then em- ploys a BN to fuse the anomalous data from each of the two threads. After that, a statistical inference method ranks the anomalous instances to decide whether to flag a threat alarm. The results show that the combination of the two threads outper- forms applying each thread on its own.