4.7 Comparison & Inference
4.7.4 Classification
The information gathered by these approaches consists of unlabelled performance metrics and configuration data. However, using this information to decide on the source of a fault first requires that this information be accurately classified – a non-trivial problem.
The state of the art for autonomously and accurately classifying unlabelled data in general is outside of the scope of this chapter. However, on occasion, problems in classifying unlabelled data are discussed in brief and as needed. This includes the strategies implemented by each of the aforementioned approaches, and their respective limitations.
There are a number of differences in how the UBL and FDF approaches classify data: how much data is utilised, at what point the information is classified, and both how and when data is processed. These differences are associated with the relative uses of each framework, although some properties are based on assumptions.
For example, the training phase for UBL is tested in situ before being applied. Using a training phase provides an advantage in that it does not require a specific set of performance tests or roles to be provided before classifying data. However, using performance tests follows a common tenet in self-managing systems research – the ability to provide high-level policies to systems as a primary form of administration. It also allows particular areas of interest to be specified – an approach that can reduce false positives.
The UBL and FDF implementations both use unlabelled data to forecast anomalies by predicting unexpected changes in feature attributes. Predictions are made by observing a period of known or assumed good states to train primitives in order to recognise an expected set of behaviours. Once this training is complete, observed data is then classified heuristically into one of several states.
UBL’s three classifications for state are determined by calculating the Manhattan distance of a neighbourhood area size using individual neurons. By analysing the differences in neighbourhood area size, UBL is able to classify the behaviours of individual features as being in one of the three aforementioned categories.
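The neighbourhood calculation can be illustrated with a minimal sketch. It assumes that a neuron's neighbourhood area size is the sum of the Manhattan distances between its weight vector and those of its immediate grid neighbours (top, bottom, left and right). The function names, the two cut-off values and the numeric category labels are hypothetical and are not taken from the UBL implementation.

```python
def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length weight vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def neighbourhood_area_size(grid, row, col):
    """Sum of Manhattan distances from the neuron at (row, col) to its
    immediate grid neighbours. Neighbours outside the grid are skipped."""
    total = 0.0
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
            total += manhattan(grid[row][col], grid[r][c])
    return total


def bucket_by_area(area, low_cutoff, high_cutoff):
    """Map an area size to one of three categories (0, 1 or 2).
    Both cut-offs are assumptions made for illustration only."""
    if area < low_cutoff:
        return 0
    if area < high_cutoff:
        return 1
    return 2


# A 2x2 map of two-dimensional weight vectors (illustrative values).
grid = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(0.0, 2.0), (5.0, 5.0)],
]
areas = [[neighbourhood_area_size(grid, r, c) for c in range(2)] for r in range(2)]
categories = [[bucket_by_area(a, 5.0, 10.0) for a in row] for row in areas]
```

A neuron whose weight vector lies far from its neighbours, such as the one at (1, 1) above, receives a large area size and therefore falls into a different category from the tightly clustered neuron at (0, 0).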
The FDFs classify information into only two states: good or faulty. However, rather than labelling the behaviours of individual features, the entirety of a system’s configuration is given a classification before looking for feature changes using performance tests. Performance tests consist of operational validation criteria – properties that indicate a system is operating within its intended role. If any performance test fails, the entire set of features sampled is classified as faulty. Once a classification is made, the data is then parsed in either the previously mentioned greedy [18] or lazy [19] fashion.
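The all-or-nothing rule described above can be sketched as follows. This is an illustration under stated assumptions rather than the FDF implementation: the two performance tests, the sample fields and the 200 ms limit are all hypothetical.

```python
def classify_system(sample, performance_tests):
    """Label the whole sample 'good' only if every performance test
    passes; a single failing test labels the entire sample 'faulty'."""
    if all(test(sample) for test in performance_tests):
        return "good"
    return "faulty"


# Hypothetical operational validation criteria for an assumed role.
def service_is_running(sample):
    return sample["service_running"]


def responds_quickly(sample):
    return sample["response_ms"] < 200  # assumed limit, for illustration


TESTS = [service_is_running, responds_quickly]

healthy = {"service_running": True, "response_ms": 120}
degraded = {"service_running": True, "response_ms": 450}

labels = [classify_system(s, TESTS) for s in (healthy, degraded)]
```

Because the label applies to the whole sample, the search for which feature changed only begins after the classification has been made.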
Using three states offers the benefit of proactive analysis over reactive analysis. Although the intention is to establish a solid understanding of accuracy in the detection of faults, doing so proactively has several potential benefits over the reactive solutions presented here. However, addressing problems proactively may be moot under any of the following conditions: the fault detected cannot be prevented, the fault manifests too quickly for a solution to be implemented, or an evolutionary approach is used to prevent further instances of said fault(s). In several instances, the second condition impacted UBL’s results and performance.
Instantiation of classification properties differs between UBL and the FDFs. Both experiments expect the systems to start off in a healthy state for the purposes of initial training. The training then forms the basis of analysis for the FDFs, but for UBL it is also the primary component for classification of sampled data. This is because the initial neighbourhood area size is calculated within this training period and is thus the reference from which all other data must be inferred. Additionally, as the data is not windowed, this is a static property outside of small, incremental updates that effectively amount to an average of values.
How long the data is stored and how it is ingested play a critical role in classification. Using a windowed approach for information parsing avoids convergence in training data and allows greater adaptivity to changing environmental variables. Both of these properties represent advantages in implementation, but they come with a cost. Windowing necessitates more memory and post-processing, as purely additive measures are no longer sufficient. As such, the expectation is for windowed information to take longer to classify and process. This is seen readily in the results of both approaches, independently.
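The trade-off between additive and windowed measures can be shown with a small sketch. It uses a mean as a stand-in statistic; the class names and the window size of four are assumptions chosen for illustration and do not come from either framework.

```python
from collections import deque


class RunningMean:
    """Purely additive: keeps only a count and a total, so memory use
    is constant, but old observations never leave the average."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def update(self, value):
        self.count += 1
        self.total += value
        return self.total / self.count


class WindowedMean:
    """Windowed: stores the last `size` observations, so it costs more
    memory and work per update, but adapts once the environment changes."""

    def __init__(self, size):
        self.window = deque(maxlen=size)  # oldest value is dropped when full

    def update(self, value):
        self.window.append(value)
        return sum(self.window) / len(self.window)


# The observed level shifts from 10 to 50 half-way through the stream.
stream = [10, 10, 10, 10, 50, 50, 50, 50]
running, windowed = RunningMean(), WindowedMean(size=4)
for value in stream:
    running_result = running.update(value)
    windowed_result = windowed.update(value)
```

After the shift, the additive mean settles between the old and new levels, while the windowed mean reflects only the recent observations, at the price of retaining and re-summing the window on every update.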
The way data is ingested impacts when and how classification occurs. WMI provides attributes associated with features in a semi-structured, non-uniquely identified tuple. In order to handle removals of devices and multiple features that share a similar namespace, this exigency must first be addressed. The FDFs’ use of a dictionary as a WMI class unique identifier serves this purpose, but also necessitates much slower parsing than the direct observation-to-vector conversion that occurs in UBL.
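A rough sketch of dictionary-based identification is given below. It does not query WMI; the rows are hand-written stand-ins for sampled tuples, and the choice of a (class name, DeviceID) pair as the key is an assumption made for illustration, not the key the FDFs actually use.

```python
def index_by_identity(rows, id_field="DeviceID"):
    """Build a dictionary keyed by (class name, instance identifier) so
    that instances sharing a class name remain distinguishable."""
    indexed = {}
    for wmi_class, attributes in rows:
        key = (wmi_class, attributes[id_field])
        indexed[key] = {k: v for k, v in attributes.items() if k != id_field}
    return indexed


def removed_features(previous, current):
    """Keys present in the earlier sample but absent from the later one."""
    return sorted(set(previous) - set(current))


# Hand-written stand-ins for two consecutive samples (values are made up).
before = index_by_identity([
    ("Win32_LogicalDisk", {"DeviceID": "C:", "FreeSpace": 1000}),
    ("Win32_LogicalDisk", {"DeviceID": "D:", "FreeSpace": 2000}),
])
after = index_by_identity([
    ("Win32_LogicalDisk", {"DeviceID": "C:", "FreeSpace": 900}),
])
removed = removed_features(before, after)
```

Building and comparing keyed dictionaries in this way costs more per sample than copying values straight into a fixed-position vector, which is consistent with the slower parsing noted above.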
Xen samples metric data from a number of different features. As the values within these samples have multiple ranges, comparing their relative performance, and consequently classifying them, can become difficult. UBL’s solution is to normalise this information so that the same evaluation techniques can be used uniformly across all features. This reduces the fidelity of the content, but lowers the programmatic overhead needed to classify the sampled data. The FDFs comparatively have greater fidelity as they do not trim or normalise their respective results.
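The effect of normalisation on fidelity can be illustrated with a short sketch. Min-max scaling to the range [0, 1] is assumed here purely for illustration; the normalisation scheme UBL actually applies is not specified in this section, and the metric values are made up.

```python
def min_max_normalise(values):
    """Scale values linearly onto [0, 1]. A constant series maps to
    zeros, since it carries no relative variation."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Two metrics with very different ranges (illustrative values).
cpu_percent = [10, 20, 30]
memory_kb = [1000, 3000, 5000]

cpu_scaled = min_max_normalise(cpu_percent)
memory_scaled = min_max_normalise(memory_kb)
```

Both series become directly comparable after scaling, which is what allows one evaluation technique to serve every feature. The original magnitudes and units are no longer recoverable from the scaled values alone, which is the loss of fidelity referred to above.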
The classification of data happens at the feature and system levels for UBL and the FDFs, respectively. This distinction impacts other aspects such as frequency of data collection and the number of features and attributes sampled. It also affects how the data is analysed: there is an implied relationship between the number of observations and what predictions, if any, can be made. However, a larger number of observations does not always provide more accurate results.