

5.4.2 Description of the Experiments

Table 5.1 presents the experiments carried out to evaluate the performance of the proposed technique AMOTRE: without trapper removal; with trapper removal; with decomposition of the majority class M only; or with decomposition of both the majority class M and the minority class I. The same experiments are performed over SMOTE so that it can be compared with AMOTRE using the measures defined later on: F1 measure, TPT, and FP. Note that trapper removal is not applied with SMOTE, because it was first introduced as an essential part (the -TRE part) of the proposed AMO-TRE technique. In the experiments, we analyse the influence of the -TRE part on our technique by applying the oversampling (1) without trapper removal (AMO-na), and (2) with trapper removal (AMO-TRE). We then analyse the influence of introducing class decomposition variations to each of SMOTE and AMOTRE.

Each of the experiments is evaluated on five base classification methods: Random Forest (rf); Extreme Gradient Boosting (xgb); and Support Vector Machines with a linear kernel (svmL), a polynomial kernel (svmP), and a radial basis function kernel (svmR).

The experiments are tuned over the parameter values shown in Table 5.2. An extensive number of experiments was carried out to select these tuning values, which were chosen because they achieved the best performance in terms of the evaluation measures described below.

Regarding class decomposition, we tuned the number of clusters for k = {2, 4, 6} for both (1) the decomposition of the majority class M, and (2) the decomposition of the minority class I. However, only the results for k = 2 are reported, because this setting gave better performance: the proposed approach was able to detect most of the malicious insider threats in the data sets for k = 2. The literature has also demonstrated the effectiveness of k-means clustering for small values of k when applied for class decomposition [104], [109].

78 Chapter 5. Supervised Learning for Imbalanced Insider Threat Detection

TABLE 5.2: Tuned Parameters for SMOTE and AMOTRE.

k = {2, 4, 6}                  number of clusters
perc.over = {200, 300, 400}    percentage of oversampling
τ = 10                         survival threshold for At in -TRE part
prob+ = {0.2, 0.5, 0.8}        controlled by the direction of ntf0
λ = {0.3, 1}                   controlled by dirS and prob+

As explained in Section 4.3.2, a malicious insider threat comprises a complex pattern of anomalous behaviours carried out by a malicious insider. For instance, in community com-P, we have 17 malicious insider threats. These threats are associated with 366 anomalous instances (behaviours) that make up the minority class I. Thus, when applying the class decomposition of the minority class I, we are actually clustering the anomalous training instances from the 366 instances appended to the artificial set Is, not the 17 malicious insider threats.
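The idea of class decomposition can be illustrated with a minimal sketch: the instances of one class are clustered with k-means and each cluster becomes a sub-class label. The function names `kmeans` and `decompose`, the sub-class naming scheme, and the deterministic initialisation from the first k points are assumptions made for illustration; they are not taken from the thesis implementation.

```python
import math

def kmeans(points, k, max_iters=100):
    """Plain Lloyd's k-means on tuples; centroids start at the first k points."""
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iters):
        # Assignment step: index of the nearest centroid (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def decompose(points, class_name, k=2):
    """Replace a single class label by k cluster-based sub-class labels."""
    return [f"{class_name}_c{lab}" for lab in kmeans(points, k)]

# Toy instances of a hypothetical majority class M (two visible groups).
majority = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0)]
sub_labels = decompose(majority, "M", k=2)
print(sub_labels)
```

The same call on the minority instances would yield sub-classes of I. The clustering operates on individual instances (behaviours), never on threats.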

Regarding the SMOTE technique, the oversampling percentage is tuned for perc.over = {200, 300, 400} to assess whether increasing the percentage of artificial samples generated for the minority class I improves the performance of SMOTE. The tuned values of perc.over are selected to achieve an approximate balance between the sub-classes (clusters). Similarly, our AMOTRE technique is tuned for perc.over = {200, 300, 400} so that it generates the same number of artificial samples as SMOTE, which ensures a fair comparison between SMOTE and AMOTRE. Note that we report the results for perc.over = 200 only, because this value achieves the lowest number of false positives (FP), which is the ultimate aim of our work.
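To make the role of perc.over concrete, the following is a minimal sketch of SMOTE-style interpolation, in which perc.over = 200 produces two artificial samples per minority instance. The function name `smote_like`, the neighbour count, and the toy data are illustrative assumptions; this is not the implementation used in the experiments, and it does not reproduce AMOTRE.

```python
import math
import random

def smote_like(minority, perc_over=200, n_neighbors=2, seed=0):
    """Generate perc_over/100 synthetic points per minority instance by
    interpolating towards one of its nearest minority neighbours."""
    rng = random.Random(seed)
    per_instance = perc_over // 100
    synthetic = []
    for i, x in enumerate(minority):
        others = minority[:i] + minority[i + 1:]
        neighbors = sorted(others, key=lambda p: math.dist(x, p))[:n_neighbors]
        for _ in range(per_instance):
            nb = rng.choice(neighbors)
            gap = rng.random()  # position along the segment from x to nb
            synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic

# Toy minority class: four instances on the corners of the unit square.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
synthetic = smote_like(minority, perc_over=200)
print(len(synthetic))  # two synthetic samples per minority instance
```

Under this reading, perc.over = 300 and perc.over = 400 would yield three and four artificial samples per minority instance respectively.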

The -TRE part is tuned for only a small value τ = 10 to test the influence of removing trapper instances from the minority class instances on the overall performance of AMOTRE. We select the survival threshold τ = 10 so that each minority instance At ∈ I having a percentile rank perclofMt < τ is considered a trapper instance and removed (as detailed in Section 5.3.2).
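The thresholding step can be sketched as follows. The per-instance score below is a placeholder, since the quantity that is actually ranked is defined in Section 5.3.2. The percentile-rank convention (percentage of scores strictly below a value) and the names `percentile_rank` and `remove_trappers` are likewise assumptions made for illustration.

```python
def percentile_rank(scores):
    """Percentile rank of each score: percentage of scores strictly below it."""
    n = len(scores)
    return [100.0 * sum(other < s for other in scores) / n for s in scores]

def remove_trappers(instances, scores, tau=10.0):
    """Keep only the instances whose percentile rank reaches the survival threshold."""
    ranks = percentile_rank(scores)
    return [inst for inst, rank in zip(instances, ranks) if rank >= tau]

# Twenty hypothetical minority instances with placeholder scores 0..19.
instances = list(range(20))
scores = [float(i) for i in range(20)]
survivors = remove_trappers(instances, scores, tau=10.0)
print(len(survivors))
```

With τ = 10 only the lowest-ranked tail of the minority class is removed, which matches the intention of testing the influence of trapper removal with a small threshold.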

The details about tuning prob+ and λ, for the values displayed in Table 5.2, can be found in the description of the AMOTRE technique in Section 5.3.

5.4. Experiments 79

For evaluation, we performed the experiments with both 2-fold and 10-fold cross validation on each of the communities. The 10-fold cross validation allows the base classification methods to learn on a 90% subsample of the data and test on a 10% subsample, repeated 10 times. In principle, this provides the most comprehensive analysis of the performance of each experiment. However, the complexity of the insider threat problem is reflected in the scarcity of anomalous instances (minority instances) available in the CMU-CERT data. In the 10-fold cross validation, a 10% testing subsample contains a limited number of anomalous instances (behaviours), as well as a small number of normal instances (behaviours). Given this, it was challenging to show the merit of our approach compared to the SMOTE experiments in terms of the number of FPs and the number of detected malicious insider threats. A 2-fold cross validation is therefore also applied, so that the base classification methods learn on a 50% subsample of the data and test on a 50% subsample, repeated 2 times. This allows the 50% testing subsample to contain more instances, including more anomalous instances (behaviours). Hence, it helps to reveal the superior performance of the AMOTRE experiments compared to the SMOTE experiments.
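The scarcity argument can be made concrete with a small calculation. The sketch below assumes that the 366 anomalous instances of community com-P are spread as evenly as possible over the folds (that is, stratified folds, which the text does not state explicitly). The helper name `fold_sizes` is hypothetical.

```python
def fold_sizes(n_instances, n_folds):
    """Sizes of n_folds test folds when n_instances are split as evenly as possible."""
    base, extra = divmod(n_instances, n_folds)
    return [base + 1 if i < extra else base for i in range(n_folds)]

n_anomalous = 366  # anomalous instances in community com-P

ten_fold = fold_sizes(n_anomalous, 10)
two_fold = fold_sizes(n_anomalous, 2)
print(ten_fold)
print(two_fold)
```

Under this assumption, each 10-fold test fold holds roughly a tenth of the anomalous behaviours, whereas each 2-fold test fold holds half of them, which leaves more room for differences in FPs and detected threats to show.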

Whether in 2-fold or in 10-fold cross validation, each fold may contain only a subset of the anomalous behaviours belonging to a malicious insider threat. In terms of training, this provides ‘weak supervision’, as some of the anomalous behaviours associated with a particular threat will be missing from the training fold(s). In terms of testing, this shows the ‘robustness’ of the approach: it must detect a malicious insider threat whose behaviours are only partially represented in the training fold(s), even with a weak signal (threats are also only partially represented in the test fold(s)). The results for both 2-fold and 10-fold cross validation are reported in this chapter.
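Threat-level evaluation over a test fold can be sketched as below. The sketch assumes that a threat counts as detected when at least one of its anomalous behaviours present in the test fold is flagged; this criterion is an assumption made for illustration and is not stated in this section. The identifiers `detected_threats` and the threat names are hypothetical.

```python
from collections import defaultdict

def detected_threats(test_instances):
    """test_instances: (threat_id, flagged) pairs, one per anomalous behaviour
    in the test fold. Returns the threats with at least one flagged behaviour."""
    flagged_by_threat = defaultdict(bool)
    for threat_id, flagged in test_instances:
        flagged_by_threat[threat_id] = flagged_by_threat[threat_id] or flagged
    return {threat for threat, hit in flagged_by_threat.items() if hit}

# Hypothetical test fold: threat t1 is partially detected, t2 is missed.
fold = [("t1", False), ("t1", True), ("t2", False), ("t3", True)]
hits = detected_threats(fold)
print(sorted(hits))
```

Because only part of each threat's behaviours lands in any one fold, a threat can be detected in one fold on the strength of a single flagged behaviour.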

In summary, 70 experiments are presented per community: 7 types of experiments, 5 base classification methods, and 2-fold versus 10-fold cross validation. Regarding the tuning of the parameters k and perc.over, we carried out 420 experiments per community.
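The count of 70 reported experiments per community follows from enumerating the combinations. In the sketch below, the classifier abbreviations come from the text, while the seven experiment-type labels are hypothetical placeholders for the entries of Table 5.1 (which is the authoritative list).

```python
from itertools import product

# Placeholder labels for the 7 experiment types; see Table 5.1 for the actual list.
experiment_types = [
    "SMOTE", "SMOTE+M", "SMOTE+M+I",
    "AMO-na", "AMO-TRE", "AMOTRE+M", "AMOTRE+M+I",
]
classifiers = ["rf", "xgb", "svmL", "svmP", "svmR"]
cv_folds = [2, 10]

# Each reported experiment is one (type, classifier, cross-validation) combination.
grid = list(product(experiment_types, classifiers, cv_folds))
print(len(grid))
```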
