
6.3 Adaptive One-Class Ensemble-based Anomaly Detection

6.3.1 One-class Modelling Component

The one-class modelling component acquires training data in the initial modelling phase. It consists of a clustering component that applies the k-means clustering method on the normal class data in order to create $k$ clusters. It then trains each cluster on

98 Chapter 6. Anomaly Detection for Insider Threat Detection

FIGURE 6.1: Proposed conceptual framework for anomaly detection. The framework has two components: a one-class modelling component, which includes a clustering component, and a progressive update component, which includes an oversampling component. $\{M_1, M_2, \dots, M_k\}$ represents the set of models generated by the ensemble. The set of blue arrows represents the testing instances declared as FPs, and each segment of blue arrows represents an FP chunk acquired sequentially.

a base method. The result is an ensemble of a base method over $k$ clusters, resulting in $k$ models. In this work, we utilised two high-performing anomaly detection methods, ocsvm and iForest, as base methods in the proposed framework to detect malicious insider threats.

Clustering Component. In Chapter 5, class decomposition was used to weaken the effect of the majority class in classification. In this chapter, however, we cluster the normal class data to identify patterns in the data, then build a one-class classifier on each cluster. In this way, local anomalies can be identified, which is not the case when the normal data is modelled as a whole.

Malicious insiders have authorised access to the network, system, and data, and are aware of the system management and security policies. These aspects help the malicious insider deceive the detection system, since some anomalous behaviour may closely resemble a normal user’s behaviour. This manifests as local anomalous instances located among normal instances. To address this issue,


we apply class decomposition [77] on the normal class data. The idea is to decompose the normal class data into clusters and to train a detector per cluster, giving more opportunity for the detector to identify local anomalous instances with respect to a cluster that might not be detected over the whole data. We utilise the k-means clustering method to identify patterns in the normal class data, given its efficiency. More information regarding the literature on class decomposition and the argument for selecting the k-means clustering method can be found in Section 5.3.1.

Let $X_t = \{x_{t1}, x_{t2}, \dots, x_{tm}\}$ represent the feature vector at session slot $t$, where $x_{tf}$; $1 \le f \le m$ represents the value of the $f$-th feature. Let $y$ represent the normal class label. Each instance (i.e. feature vector) $X_t$ either belongs or does not belong to the normal class $y$. Let $N = \{X_t \ \forall t;\ X_t \in y\}$ represent the set of instances that belong to the normal class $y$. If we apply the k-means clustering method on the set $N$, then $N$ decomposes into $k$ clusters. Let $C = \{C_1, C_2, \dots, C_k\}$ represent the set of $k$ clusters.
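The decomposition of $N$ into $C$ can be sketched as follows. This is a minimal illustration of plain k-means (Lloyd's algorithm) in standard-library Python, not the implementation used in this work; the six two-feature instances and the choice of seeding the centroids from two of them are hypothetical values chosen so that the result is deterministic.

```python
import math


def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's algorithm. Returns (clusters, centroids)."""
    centroids = [tuple(c) for c in centroids]
    clusters = []
    for _ in range(iterations):
        # Assignment step: each instance joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:  # converged: assignments can no longer change
            break
        centroids = updated
    return clusters, centroids


# Hypothetical normal-class set N: two behaviour patterns in a 2-feature space.
N = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]

# k = 2, with initial centroids seeded from two instances (an assumption).
C, centres = kmeans(N, centroids=[N[0], N[3]])
```

Here `C[0]` and `C[1]` play the role of $C_1$ and $C_2$; each would then be handed to its own base method for training.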

Fig. 6.2 represents the normal instances with blue circles, and the anomalous instances with squares. Let the solid-line outer circle represent the decision boundary generated by the base method over the whole normal data to separate the normal instances from the anomalous instances. Consider $k = 2$, so that the normal data instances are grouped into 2 clusters. We construct an ensemble of $k$ base methods and train a base method on each of the $k$ clusters. Let the inner dashed circles represent the decision boundaries generated by each cluster’s base method. Fig. 6.2 reveals two types of anomalous instances, defined below:

• Borderline and outlier instances: represented by red filled squares. Those instances are located at the borderline with respect to the solid-line outer decision boundary, or are far outliers; and

• Trapper instances: represented by red empty squares. Those instances are located in a sparse or dense area of normal instances.

The borderline and outlier anomalous instances can be easily detected by one of the aforementioned base methods trained over the whole normal data. However, as shown in Fig. 6.2, the solid-line outer decision boundary would not be able to detect trapper instances. As noted above, the trapper instances are located among


FIGURE 6.2: Clustering normal class instances. The blue circles represent the normal instances, and the squares represent the anomalous instances. The red filled squares represent borderline and outlier instances, and the red empty squares represent trapper instances. The solid-line outer circle represents the decision boundary generated by the base method over the whole normal data. The inner dashed circles represent the decision boundaries generated by each cluster’s base method.

normal instances, and therefore the base method would declare them as normal instances (e.g. iForest would not assign a high anomaly score).

To address this issue, we propose to decompose the normal class data into clusters and to train an ensemble of a base method per cluster, so that the trapper instances can be identified by the base method(s) as anomalous with respect to cluster(s). Fig. 6.2 shows that the inner dashed decision boundaries, generated by clusters, can detect the trapper instances located in the disjoint area (i.e. sparse area of normal instances). Nevertheless, some trapper instances would still exist inside the clusters which may not be detected as positives, due to their existence in an inner dense area of normal instances.
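The geometry of Fig. 6.2 can be reproduced with a toy example. The sketch below deliberately swaps the base methods used in this work (ocsvm and iForest) for a much simpler stand-in, a hypothetical `RadiusDetector` that draws a sphere around the training centroid, so that the outer and inner decision boundaries are literal circles. The data points are invented for illustration.

```python
import math


class RadiusDetector:
    """Stand-in base method (an assumption, not ocsvm or iForest):
    the decision boundary is a sphere around the training centroid
    whose radius reaches the farthest training instance."""

    def fit(self, points):
        self.centre = tuple(sum(col) / len(points) for col in zip(*points))
        self.radius = max(math.dist(p, self.centre) for p in points)
        return self

    def is_anomalous(self, x):
        return math.dist(x, self.centre) > self.radius


# Hypothetical normal data, already decomposed into k = 2 clusters.
C1 = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1)]
C2 = [(5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]

# Solid-line outer boundary: one model over the whole normal data.
global_model = RadiusDetector().fit(C1 + C2)
# Inner dashed boundaries: one model per cluster.
cluster_models = [RadiusDetector().fit(C1), RadiusDetector().fit(C2)]

outlier = (9.0, 9.0)          # far outside the outer boundary
sparse_trapper = (3.0, 3.0)   # in the sparse area between the two clusters
dense_trapper = (1.05, 0.95)  # inside the dense area of cluster C1

global_flags = [global_model.is_anomalous(x)
                for x in (outlier, sparse_trapper, dense_trapper)]
cluster_flags = [all(m.is_anomalous(x) for m in cluster_models)
                 for x in (outlier, sparse_trapper, dense_trapper)]
```

With these values the outer boundary flags only the outlier, while the per-cluster boundaries also flag the trapper in the sparse area. The trapper inside the dense area of `C1` remains undetected in both cases, which matches the limitation noted above.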

Upon decomposing the normal training data set into $k$ clusters, the role of the one-class modelling component is to train an ensemble of a base method on the $k$ clusters to generate $k$ initial models. Let $M = \{M_1, M_2, \dots, M_k\}$ represent the set of models generated by the ensemble. Each initial model $M_i$ for $C_i$; $1 \le i \le k$ is then used to detect the malicious insider threats in the testing data set. The decision $d_{ti}$ for a testing instance $X_t$ with respect to $M_i$ takes a Boolean form in {True, False}. In the case of ocsvm, an instance $X_t$ either belongs or does not belong to the normal class $y$. In the case of iForest, $X_t$ is identified as an anomalous instance if its anomaly score is greater than a defined anomaly score threshold $\tau$. The parameter $\tau$ needs to be tuned, since an optimal isolation of anomalous instances is achieved at a certain threshold $\tau$, as examined later in Section 6.4.
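The effect of the threshold $\tau$ on the iForest-style decision can be illustrated with a short sketch. The anomaly scores and the candidate values of $\tau$ below are hypothetical; the sketch assumes the convention stated above, where a higher score means more anomalous.

```python
def iforest_style_decision(anomaly_score, tau):
    """Boolean decision: True (anomalous) when the score exceeds tau."""
    return anomaly_score > tau


# Hypothetical anomaly scores of five testing instances under one model M_i.
scores = [0.42, 0.48, 0.55, 0.61, 0.73]

# Number of instances flagged as anomalous for each candidate threshold.
flagged_per_tau = {
    tau: sum(iforest_style_decision(s, tau) for s in scores)
    for tau in (0.4, 0.5, 0.6, 0.7)
}
```

Raising $\tau$ flags fewer instances, which is why the threshold has to be tuned. If a library implementation is used, its score convention should be checked first: for example, scikit-learn's `IsolationForest.score_samples` returns the negated score, so lower values are more abnormal there.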


After that, the ensemble acquires a set of decisions $D_t = \{d_{t1}, d_{t2}, \dots, d_{tk}\}$ for a testing instance $X_t$ to vote whether $X_t$ is normal or anomalous. Each decision $d_{ti} \in D_t$; $1 \le i \le k$ is taken by a model $M_i$ for a cluster $C_i$ in the ensemble. The voting mechanism is executed as follows: (1) If $d_{ti} \in D_t$ votes for anomalous $\forall i$ (i.e. by all models), then the overall decision $D_t$ declares $X_t$ as anomalous behaviour, and consequently flags an alarm warning of a malicious insider threat; (2) If $\exists\, d_{ti} \in D_t$ that votes for normal, then the overall decision $D_t$ declares $X_t$ as normal behaviour.
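The voting rule amounts to a unanimity test over the $k$ decisions. A minimal sketch, assuming each decision is encoded as a Boolean where True means the model voted anomalous (the function name and encoding are illustrative choices):

```python
def ensemble_decision(decisions):
    """Overall decision for one testing instance X_t.

    decisions: the k Boolean votes in D_t, True = anomalous.
    Returns True (raise an alarm) only if every model voted anomalous;
    a single vote for normal makes the overall decision normal.
    """
    if not decisions:
        raise ValueError("D_t must contain at least one decision")
    return all(decisions)


unanimous = ensemble_decision([True, True, True])  # rule (1): anomalous
one_normal = ensemble_decision([True, False, True])  # rule (2): normal
```

Requiring unanimity follows from the per-cluster design: a normal instance is expected to fall inside the boundary of its own cluster only, so the other $k - 1$ models will typically vote anomalous for it.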