Use Data
Part III focuses on analyzing and understanding data collected from a computer and network system under attack or normal use conditions, especially discussing distinctive data characteristics which enable detection and identification of attack events. An event can be an activity, a state change, or a performance change which is a part of the cause–effect chain triggered by a given attack (see Chapter 1 for the description of a cause–effect chain of activity, state and performance in a resource–process–user interaction). A data characteristic of a given attack is a significant change in a feature of data observations for one or multiple data variables which appears at the time of one or more events in the cause–effect chain of the attack. Hence, three concepts are involved in defining a data characteristic of a given attack: data, feature, and characteristic.
Data collected from a computer and network system consists of data variables and their data observations which capture activities, state changes and performance changes on the system.
Chapter 2 gives examples of data variables, Network Interface\Packets/sec, Memory\Available Bytes, and Process(_Total)\Page Faults/sec, which can be collected using the Windows performance objects. Chapter 2 also describes various facilities and tools on a Windows operating system to collect activity, state and performance data from a computer and network system. Among those facilities and tools, the Windows performance objects provide facilities to collect a comprehensive set of activity, state and performance data from a host computer, which enable the cause–effect chain of activities, state changes and performance changes triggered by an attack to be traced. Other facilities and tools on Windows collect primarily activity data without state and performance data. Hence, research reported in Part III investigates activity, state and performance data which is collected from computers using the Windows performance objects. Specific objects and data variables within each object are described in detail in Chapter 7.
In Part III, Windows performance objects data is collected under eleven attack conditions and two normal use conditions to provide attack and normal use data for investigation. Chapter 7 describes these attack and normal use conditions in detail. Not all data variables that can be collected from the Windows performance objects capture the specific activities, state changes and performance changes associated with a given attack. Only data variables that are relevant to specific activities, state changes and performance changes in the cause–effect chain of a given attack are useful for detecting events of the attack. Such data variables are identified for each of the eleven attacks in Chapters 8–11.
Secure Computer and Network Systems: Modeling, Analysis and Design. Nong Ye. © 2008 John Wiley & Sons, Ltd
A feature is a measure of a property which exists in a sample of data observations for one or multiple data variables. Only univariate mathematical/statistical features (features of a data sample from one data variable) are investigated in Part III. These univariate mathematical/statistical features include the statistical mean in Chapter 8, the probability distribution in Chapter 9, the autocorrelation in Chapter 10, and the wavelet-based signal strength in Chapter 11, which covers the Haar wavelet, the Daubechies wavelet, the Derivative of Gaussian wavelet, the Paul wavelet and the Morlet wavelet. These wavelets are used to extract the time-frequency signal changes associated with the data patterns of step change, steady change, random change, spike change and sine-cosine wave with noise. Chapters 8–11 provide mathematical/statistical methods of extracting the mean, probability distribution, autocorrelation, and wavelet features from attack and normal use data.
Among the four features, the distribution feature gives a more comprehensive picture of a data sample than the mean feature. Both the wavelet feature and the autocorrelation feature reveal relations of data observations over time. The autocorrelation feature focuses on the general autocorrelation aspect of time series data, whereas the wavelet feature focuses on special forms of time-frequency data patterns. Both the various wavelet forms and the various probability distributions are linked to certain data patterns. The distribution feature describes the general pattern of the data, whereas the wavelet feature reveals the time locations and frequencies of special data patterns. Hence, the wavelet feature reveals more special data features than the distribution feature and the autocorrelation feature.
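To make the four feature types concrete, the following is a minimal sketch that computes each of them for one short data sample. It is not the method of Chapters 8–11: the function names, the equal-width histogram used as the distribution feature, the lag-1 default, the single-level Haar transform, and the sample values are all assumptions chosen for illustration.

```python
import math
from statistics import fmean


def sample_mean(x):
    """Mean feature: the arithmetic mean of the data sample."""
    return fmean(x)


def histogram(x, bins=4):
    """Distribution feature (assumed form): fraction of observations
    falling into each of `bins` equal-width intervals."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant data
    counts = [0] * bins
    for v in x:
        idx = min(int((v - lo) / width), bins - 1)  # the maximum goes in the last bin
        counts[idx] += 1
    return [c / len(x) for c in counts]


def autocorrelation(x, lag=1):
    """Autocorrelation feature: sample autocorrelation at the given lag."""
    m = fmean(x)
    denom = sum((v - m) ** 2 for v in x)
    if denom == 0:
        return 0.0  # constant data: define the autocorrelation as 0 here
    num = sum((x[t] - m) * (x[t + lag] - m) for t in range(len(x) - lag))
    return num / denom


def haar_detail(x):
    """Wavelet feature (assumed form): level-1 Haar detail coefficients,
    using the (first - second) / sqrt(2) sign convention. A trailing
    unpaired observation is ignored."""
    return [(x[2 * i] - x[2 * i + 1]) / math.sqrt(2) for i in range(len(x) // 2)]


# Hypothetical observations of one data variable containing a spike change.
spike = [10, 10, 10, 10, 10, 90, 10, 10]

print("mean:", sample_mean(spike))
print("distribution:", histogram(spike))
print("lag-1 autocorrelation:", autocorrelation(spike))
print("Haar detail coefficients:", haar_detail(spike))
```

In this sample the mean and the histogram summarize the values without regard to their order, while the only nonzero Haar detail coefficient belongs to the pair of observations containing the spike, which illustrates how a wavelet feature can localize a special data pattern in time.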
Note that there are other types of univariate features (e.g., features extracting other trends or patterns of data) as well as multivariate features (i.e., features of data from multiple data variables) which are not investigated in Part III but may be useful in revealing data characteristics of various attacks.
If one or more events of a given attack cause a significant change in a specific feature of a data variable, this change is considered a data characteristic of the attack. Chapters 8–11 describe statistical tests to identify a significant change in a given data feature, and reveal data characteristics of the eleven attacks in the mean, probability distribution, autocorrelation, and wavelet features. If a specific data characteristic appears during a given attack but not during other attacks or normal use activities, this data characteristic is considered a unique data characteristic of that attack and can be used to uniquely detect and identify this attack. Note that an event may manifest through more than one data characteristic (e.g., more than one data variable or more than one feature of a data variable). The identified attack characteristics in the mean, distribution, autocorrelation and wavelet features are used to uncover the similarities and differences among the attacks.
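The two definitions above, a data characteristic as a significant change in a feature and a unique data characteristic as one that appears under only one condition, can be sketched as follows. This is an illustration under stated assumptions, not the statistical tests of Chapters 8–11: Welch's t statistic with a fixed cutoff of 3.0 stands in for a proper significance test, and the function names, condition names, and all data values are hypothetical rather than reported results.

```python
import math
from statistics import fmean, variance


def welch_t(normal, attack):
    """Welch's t statistic for the difference between two sample means."""
    se = math.sqrt(variance(normal) / len(normal) + variance(attack) / len(attack))
    diff = fmean(attack) - fmean(normal)
    if se == 0:
        # Both samples are constant: any nonzero difference is treated as extreme.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def mean_characteristic(normal, attack, threshold=3.0):
    """Return '+' or '-' when the mean shifts beyond the (assumed) cutoff,
    otherwise None. The cutoff is an illustrative stand-in for a p-value."""
    t = welch_t(normal, attack)
    if t > threshold:
        return "+"
    if t < -threshold:
        return "-"
    return None


def unique_characteristics(found):
    """`found` maps a condition name to a set of (variable, feature, direction)
    tuples. Keep, per condition, only the tuples seen under no other condition."""
    unique = {}
    for cond, chars in found.items():
        others = set().union(*(c for k, c in found.items() if k != cond))
        unique[cond] = chars - others
    return unique


# Hypothetical observations of one data variable under two conditions.
normal_obs = [100, 102, 98, 101, 99, 100]
attack_obs = [140, 138, 143, 141, 139, 142]
print(mean_characteristic(normal_obs, attack_obs))

# Hypothetical characteristics for two made-up attack conditions.
found = {
    "attack_a": {("Memory\\Available Bytes", "mean", "-"),
                 ("Network Interface\\Packets/sec", "mean", "+")},
    "attack_b": {("Network Interface\\Packets/sec", "mean", "+")},
    "normal_use": set(),
}
print(unique_characteristics(found))
```

In this made-up example the increase in Network Interface\Packets/sec is shared by both attack conditions and so identifies neither, whereas the decrease in Memory\Available Bytes appears only under `attack_a` and would count as a unique characteristic of it.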
The data characteristics of attack and normal use activities discovered in Part III are essential to building attack detection models that detect attacks accurately and early. Attack detection models are covered in Parts IV–VI.