Any anomaly-based intrusion detection system (AIDS) is subject to mimicry attacks. Tan
et al. [Tan et al., 2002] identified two mechanisms for performing mimicry attacks: (1) contaminating the learning and/or model update process by inserting attack data into normal user data, and (2) intertwining attacks with normal user activity so that the attacks
go undetected, which is also known as an evasion attack. I assume that the classifier training process has not been subject to a poisoning or data contamination attack, and
focus the analysis here on evasion attacks. Wagner and Soto listed six types of evasion attacks against host-based intrusion detection systems using a malicious sequence of user
CHAPTER 5. DIVERSIFYING DETECTION APPROACHES
(a) Modeling Using Feature Vectors per 2-minute Quanta
(b) Modeling Using Feature Vectors per 5-minute Quanta
(c) Modeling Using Feature Vectors per 15-minute Quanta
Figure 5.2: AUC Comparison By User Model for the Search Profiling and Integrated Detection Approaches
commands or system calls as previously noted in Section 2.5. Most of these attacks do not
apply to the RUU sensor, as I model only frequencies of user activities and not sequences of user commands. Below I discuss whether the RUU sensor is vulnerable to these attacks.
1. The slip under the radar attack: The attacker avoids causing any change to the observable behavior of the application, i.e., he or she does not launch any processes that the
legitimate user would not normally run. The attacker uses only processes that are already running, and at the same rate as the victim user. This is equivalent to the ‘slow-and-low’
Figure 5.3: False Alert Frequency By Modeling Epoch Length
attack. When trying to evade the RUU sensor, the attacker would then have to refrain
from engaging in any large-scale search activities. He or she would also have to access files at a slow rate, so that his or her activity goes undetected. Consequently, it would
take the attacker longer to find any interesting or relevant information that could be stolen. Although masquerade activity in this case would take much longer, the adversary is
still very likely to access decoy files. Based on the human subject studies described in Chapter 4, every attacker who accessed the victim’s system was detected within 10
minutes, regardless of his or her search behavior. The user study results demonstrated that 90% of masqueraders could be detected by monitoring decoy file accesses alone,
with 98% confidence, as noted in Chapter 4.
2. The be patient attack: The attacker waits for the time when the malicious sequence
is accepted. The RUU sensor does not implement sequence-based modeling, and, therefore, is not vulnerable to this type of attack.
3. The be patient, but make your own luck attack: The argument applied to the previous attack also holds for this attack.
4. The replace system call parameters attack: This is also not applicable to the RUU sensor as the detection approach does not rely on monitoring system calls.
5. The insert no-ops attack: Since the features used by the RUU sensor are based on frequencies of certain user-initiated activities and processes rather than on modeling
sequences of system calls, inserting no-ops does not have a significant impact on the RUU sensor, and can potentially only slow down user activities. An extreme case of this
could turn into a slow-and-low attack, which I discussed under the first type of evasion attacks.
6. The generate equivalent attacks attack: An attacker might decide to load the entire search index file at once into memory and directly read it instead of searching for files
using the desktop search tool user interface. This would reduce the number of user search actions detected by the RUU sensor, and may impact its ability to detect the
attacker’s fraudulent activity if relying on modeling search behavior only. Again in this case, monitoring accesses to decoy files becomes more significant in detecting the
attacker’s activity.
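The argument running through the list above, that frequency-based features are insensitive to reordering and to inserted no-ops, can be illustrated with a minimal sketch. It is not the RUU sensor's implementation: the event categories in `MONITORED`, the `(timestamp, kind)` event format, and the function `epoch_features` are hypothetical names introduced only for this example.

```python
from collections import Counter

# Hypothetical event categories (an assumption for illustration only);
# the sensor's actual feature set is not reproduced here.
MONITORED = ("search", "file_access", "process_start")


def epoch_features(events, epoch_seconds=120):
    """Count monitored events per epoch.

    `events` is a list of (timestamp_in_seconds, kind) pairs.
    Returns {epoch_index: (count per kind, in MONITORED order)}.
    """
    counts = {}
    for ts, kind in events:
        if kind not in MONITORED:
            continue  # unmonitored events, such as no-ops, never reach the model
        epoch = int(ts // epoch_seconds)
        counts.setdefault(epoch, Counter())[kind] += 1
    return {e: tuple(c[k] for k in MONITORED) for e, c in counts.items()}


attack = [(5, "search"), (20, "file_access"), (40, "search"), (70, "file_access")]

# Inserting no-ops or reordering events leaves the per-epoch counts unchanged.
padded = attack + [(10, "noop"), (30, "noop"), (50, "noop")]
reordered = list(reversed(attack))

# Slowing the same activity down spreads it over more epochs (slow-and-low).
slowed = [(ts * 4, kind) for ts, kind in attack]

print(epoch_features(attack))
print(epoch_features(padded))
print(epoch_features(slowed))
```

In this toy setting the only way to lower the per-epoch counts is to stretch the activity over more epochs, which is the slow-and-low case discussed under the first attack type.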
I conjectured that combining the baiting technique with the user search behavior profiling technique serves as a defense mechanism against mimicry attacks, or evasion attacks in particular. I assume that user models and training data were not contaminated with
masquerader data during the model training or update phases. In order to validate my conjecture, one would ideally have a masquerader mimic a legitimate user’s behavior. However, when simulating masquerade attacks as described in the ‘capture-the-flag’ exercise, it was extremely difficult to make the volunteers participating in the user study mimic the
behavior of a specific user. To evaluate my conjecture though, I reviewed all search behavior models and identified the user who had the most similar search behavior to that
exhibited by masquerade attackers. To identify this user, I measured the similarity between the legitimate user behavior and masquerade behavior by applying the probability product kernel to the distribution of their feature vectors [Jebara et al., 2004]. User 13 showed the
closest behavior to that of masqueraders as can be seen from Figure 5.4. This figure depicts the distribution of the three search-related features for user 13, and for all masqueraders
combined. The results achieved by this user’s model under the integrated approach in Figure 5.2 support my conjecture: they are indeed significantly better than the results achieved using
the search behavior modeling technique alone, particularly when I extend the monitoring and modeling window to 15 minutes.
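The similarity measurement described above can be sketched as follows. This is a hedged illustration rather than the procedure actually used: it assumes each distribution is estimated as a normalized histogram over shared bin edges, that a single feature is compared, and that the kernel exponent is ρ = 1/2 (the Bhattacharyya special case of the probability product kernel). The sample values and the names `histogram`, `product_kernel`, `user A`, and `user B` are hypothetical.

```python
def histogram(values, edges):
    """Normalized histogram of `values` over shared bin `edges`.

    Bins are half-open [lo, hi), except the last, which also includes its
    upper edge. Values outside the edges are ignored.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    if total == 0:
        raise ValueError("no values fall inside the bin edges")
    return [c / total for c in counts]


def product_kernel(p, q, rho=0.5):
    """Discrete probability product kernel: sum_i (p_i * q_i) ** rho.

    With rho = 0.5 (assumed here) this equals 1.0 for identical
    distributions and 0.0 for distributions with disjoint support.
    """
    return sum((pi * qi) ** rho for pi, qi in zip(p, q))


# Made-up values of one search-related feature per modeling epoch.
edges = [0, 5, 10, 20]
masqueraders = histogram([12, 15, 18, 7, 14], edges)
users = {
    "user A": histogram([1, 2, 3, 4, 6], edges),
    "user B": histogram([8, 11, 13, 16, 19], edges),
}

# The legitimate user whose distribution is closest to the masqueraders'.
closest = max(users, key=lambda u: product_kernel(users[u], masqueraders))
print(closest)
```

For several features, one option (again an assumption, not something the text specifies) would be to multiply the per-feature kernel values, which treats the features as independent.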
Figure 5.2(c) shows that all attacks are detected for user 13 when using the combined detection approach, while a high number of false positives is still recorded if
only the search profiling approach is used.
One might expect that hardening the detector against mimicry attacks could drive
higher FP rates. The experimental results show the opposite effect. Figure 5.5 helps in understanding how this can be achieved. When using the search profiling approach only,
the circular point above the threshold line in Figure 5.5(a) triggers a false positive. If I use a lower threshold beyond which search behavior is considered suspicious as in Figure 5.5(b),
I can widen the anomaly space for the detector. This in turn means that the adversary has to work harder in order to faithfully mimic the legitimate user’s behavior. However,
this alone may introduce false positives. By combining search profiling with the baiting technique, I can use a second threshold for the highly abnormal search behavior, beyond
which a 100% TP rate is achieved. For points that fall in the ‘ambiguous’ space between the two thresholds, the access to decoy information can be used to inform the classification
decision. The key to this process is the use of decoy documents that are strategically placed, highly enticing, and conspicuous in the file system, so that the attacker is very likely to touch
them. The ‘capture-the-flag’ exercise showed that all masqueraders did indeed touch at least one of the placed decoy files as reported in Figure 4.10.
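The two-threshold decision logic described above can be sketched as a small function. This is a minimal illustration under stated assumptions, not the detector's actual code: it assumes a scalar anomaly score in which larger values mean more abnormal search behavior, and it treats scores below the lower threshold as legitimate, a case the text does not specify. The function name `classify`, its parameters, and the threshold values are hypothetical.

```python
def classify(anomaly_score, decoy_touched, low_threshold, high_threshold):
    """Combine search profiling with decoy-access evidence.

    Assumptions (not taken from the source): larger scores are more
    abnormal, and scores below `low_threshold` are treated as legitimate.
    """
    if low_threshold > high_threshold:
        raise ValueError("low_threshold must not exceed high_threshold")
    if anomaly_score >= high_threshold:
        # Highly abnormal search behavior: flagged on the profile alone.
        return "masquerader"
    if anomaly_score >= low_threshold:
        # Ambiguous region: decoy access informs the decision.
        return "masquerader" if decoy_touched else "legitimate"
    return "legitimate"


# Illustrative threshold values only.
LOW, HIGH = 0.4, 0.8
print(classify(0.9, False, LOW, HIGH))
print(classify(0.6, True, LOW, HIGH))
print(classify(0.6, False, LOW, HIGH))
```

Lowering `low_threshold` widens the region an adversary must stay below to evade the search profile, while the decoy check keeps legitimate users whose scores fall in the ambiguous region from raising alerts.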