5.2 Implementation Overview

5.2.3 Artificial Neural Network Component

The neural network components were designed to be attached to the end of the MCRDR component using an integration technique described in section 5.2.4. The ANNs used are based on the backpropagation (4.3) and RBF (4.4) techniques. Both of these methods are traditionally supervised neural networks. However, they have also been used in unsupervised environments, such as Tesauro’s TD-Gammon (1994), in which a separate reinforcement learning method acts as the supervisor. In this thesis the networks are used in a similarly unsupervised way, except that there is no temporal separation. Specifically, training is supplied either by the classification of the case or by a calculated class or rating. The training rewards used in the various experiments are described fully in section 6.1.

Seven methods were developed; they are described in detail starting in section 5.3. Each method was designed to test a particular area of the tasks required by the system. Some methods were not expected to work very effectively, but were developed to show that the more advanced methods were required to capture the hidden and dynamic contexts. The primary issue in this project, however, is how to use a neural network in an environment with an ever-changing input space. The following section defines this problem mathematically, describing exactly what output can be generated from the MCRDR inferencing process and what is required of the network.

5.2.3.1 The Problem

MCRDR produces an output that can be interpreted in a number of different ways. As the inference process is performed, a number of rules fire down the tree and through particular branches. Keeping track of the rules that have fired is one form of output, and one that is highly descriptive of the KB structure. A subset of this is to consider only the final nodes in each path, or terminating rules. There is also information within the rules’ nodes themselves. For instance, each terminating rule contains an identified classification or action, which is the usual output from the MCRDR inferencing process. However, many terminating rules frequently share the same classification, so using classifications alone reduces the useful information obtained from the MCRDR tree. Lastly, each rule also contains a number of attributes that apply to that particular rule. These potentially indicate important global information; however, using only attribute information removes some of the context-based knowledge in the KB.

Expressing these outputs mathematically, the output from the MCRDR methodology is essentially a set of rules fired, denoted $R$, where $R \in \wp(R^*)$ and $R^*$ is the set of all the rules currently in the knowledge base. Furthermore, the final terminating rules are those that inferencing could not continue past, either because they are leaf nodes or because none of their children fired. This output can be given by $R_T \in \wp(R^*)$, where $R_T$ is the set of terminating rules and $R_T \in \wp(R)$. Using the more common classification-based view, with the classifications denoted $C$, then $C \in \wp(C^*)$, where $C^*$ is the set of all possible classifications. Similarly, the attributes of fired rules could be expressed as $A \in \wp(A^*)$, where $A^*$ is the set of all possible attributes currently identified within the MCRDR KB.

Clearly, regardless of which method or combination of methods is used for gathering MCRDR’s output, the fundamental mathematical nature of that output is the same. Therefore, for simplicity’s sake this is reduced to a common notation. MCRDR can be said to produce a set of features, denoted $F$, as outputs, where $F \in \wp(F^*)$, $F^*$ is the set of all the features currently in the knowledge base, and $F^* = R^* \cup C^* \cup A^*$.
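To make the notation concrete, the following is a minimal sketch of how the sets $R$, $R_T$, $C$ and $A$ might be collected during inference over a toy rule tree. The `Rule` class, the `mcrdr_features` function, the case format and the three example rules are all hypothetical and are not taken from the implementation described in this thesis. Feature names are prefixed so that the union of rules, classifications and attributes contains no accidental collisions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

Case = Dict[str, Set[str]]  # hypothetical case format: named sets of tokens


@dataclass
class Rule:
    rule_id: str
    attributes: Set[str]                # attributes tested in the rule's condition
    condition: Callable[[Case], bool]
    classification: str
    children: List["Rule"] = field(default_factory=list)


def mcrdr_features(
    roots: List[Rule], case: Case
) -> Tuple[Set[str], Set[str], Set[str], Set[str]]:
    """Return (R, R_T, C, A) for one case."""
    fired: List[Rule] = []
    terminating: List[Rule] = []

    def visit(rule: Rule) -> bool:
        if not rule.condition(case):
            return False
        fired.append(rule)
        # Every child is tried, since inference follows all satisfied branches.
        child_fired = [visit(child) for child in rule.children]
        if not any(child_fired):
            terminating.append(rule)  # leaf node, or no child fired
        return True

    for root in roots:
        visit(root)

    R = {f"rule:{r.rule_id}" for r in fired}
    R_T = {f"rule:{r.rule_id}" for r in terminating}
    C = {f"class:{r.classification}" for r in terminating}
    A = {f"attr:{a}" for r in fired for a in r.attributes}
    return R, R_T, C, A


# Toy knowledge base (illustrative only): r3 is an exception to r1.
r3 = Rule("r3", {"transfer"}, lambda c: "transfer" in c["words"], "Business")
r1 = Rule("r1", {"football"}, lambda c: "football" in c["words"], "Sport", [r3])
r2 = Rule("r2", {"goal"}, lambda c: "goal" in c["words"], "Sport")
ROOTS = [r1, r2]

R, R_T, C, A = mcrdr_features(ROOTS, {"words": {"football", "goal"}})
F = R | C | A  # the combined feature set for this case
```

In this toy case both terminating rules carry the same classification, so $C$ holds a single element while $R_T$ holds two, which illustrates the loss of information noted above when only classifications are used.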

Given the above input to the ANN method, the output is a set of values, $v$, which provides one or more ratings, allowing for applications where dissimilar tasks may need to be rated differently. For instance, $v_0$ may identify the desirability or importance of the case presented. Therefore, a mapping $F \to v$ must be found for all $F \in \wp(F^*)$. Additionally, RM should be able to learn this mapping quickly for both linear and non-linear sets of features and be able to generalise effectively. Thus, RM needs to identify patterns of features and then associate a value with each pattern through the use of a function-fitting algorithm.

The neural network was integrated into MCRDR by linking each feature used to an input neuron. Thus, for each feature found by the MCRDR system, an associated neuron will fire. The obvious problem with this, as mentioned in the introduction, is that the input space is constantly growing. Every time the expert notices a deficiency in the KB they add a new rule, potentially with attributes never seen before in the conditions and occasionally forming a new conclusion. Therefore, the output feature space, $F^*$, of MCRDR can grow towards $F^{\#}$, where $F^{\#} \geq F^*$ and $F^{\#}$ is finite but unknown.
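The link between features and input neurons, and the way the input space grows with it, can be sketched as a registry that assigns each newly seen feature the next free input index. This is only an illustration: the `InputRegistry` class and the feature names are hypothetical, and a binary (fired or not fired) encoding is assumed.

```python
from typing import Dict, Iterable, List


class InputRegistry:
    """Maps each feature to one input neuron index (hypothetical helper)."""

    def __init__(self) -> None:
        self._index: Dict[str, int] = {}

    @property
    def size(self) -> int:
        """Current number of input neurons, i.e. features seen so far."""
        return len(self._index)

    def encode(self, features: Iterable[str]) -> List[float]:
        """Return the input vector for one case, growing the registry as needed."""
        present = sorted(set(features))  # sorted so index assignment is deterministic
        for feature in present:
            if feature not in self._index:
                self._index[feature] = len(self._index)  # new input neuron
        x = [0.0] * len(self._index)
        for feature in present:
            x[self._index[feature]] = 1.0  # the associated neuron fires
        return x


registry = InputRegistry()
x1 = registry.encode({"rule:r1", "class:Sport"})
x2 = registry.encode({"rule:r1", "rule:r7"})  # "rule:r7" was just added by the expert
```

The second vector is longer than the first, which is exactly the difficulty: a network trained on two inputs must now accept three.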

To overcome this problem, a method for adding input nodes was required. This cannot be easily achieved without significantly influencing the already semi-trained network. As will be detailed later in this chapter, certain effects can be removed or minimised through careful redesign of the ANN algorithms used and careful selection of initial weights for the new nodes. These measures, however, are unique to each technique developed.
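One simple way to see how the choice of initial weights can limit the disturbance is to give every connection from a new input node a weight of zero. The sketch below assumes a single hidden layer of sigmoid units and a single sigmoid output; the `GrowableNet` class and the zero-initialisation policy are assumptions made for illustration and are not the specific adjustments developed for the methods in this chapter.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


class GrowableNet:
    """One hidden layer, one output, inputs can be appended (illustrative only)."""

    def __init__(
        self,
        hidden_weights: List[List[float]],  # one row of input weights per hidden unit
        hidden_bias: List[float],
        out_weights: List[float],
        out_bias: float,
    ) -> None:
        self.W = [row[:] for row in hidden_weights]
        self.b = hidden_bias[:]
        self.v = out_weights[:]
        self.c = out_bias

    @property
    def n_inputs(self) -> int:
        return len(self.W[0])

    def forward(self, x: List[float]) -> float:
        if len(x) != self.n_inputs:
            raise ValueError("input vector length does not match the network")
        h = [
            sigmoid(bias + sum(w * xi for w, xi in zip(row, x)))
            for row, bias in zip(self.W, self.b)
        ]
        return sigmoid(self.c + sum(vj * hj for vj, hj in zip(self.v, h)))

    def add_input(self, initial_weight: float = 0.0) -> None:
        """Append one input node; a zero weight leaves existing outputs unchanged."""
        for row in self.W:
            row.append(initial_weight)


net = GrowableNet(
    hidden_weights=[[0.2, -0.4], [0.7, 0.1]],
    hidden_bias=[0.0, 0.1],
    out_weights=[0.5, -0.3],
    out_bias=0.05,
)
before = net.forward([1.0, 0.0])
net.add_input()                       # the expert has added a new feature
after = net.forward([1.0, 0.0, 1.0])  # same case, with the new neuron firing
```

With zero initial weights the new node contributes nothing to any hidden unit, so `before` and `after` agree. The new weights can still be trained afterwards, because their gradient depends on the input value rather than on the weight itself.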

One interesting feature of this problem is that if the expert has just added a new rule, it may be possible to assume that they are reasonably sure of the class or value for that case, since they have just spent time reviewing it themselves. Therefore, it may be possible to automatically ensure that the network is adjusted accurately for the newly added rule and neuron. This would need to be done by directly calculating a plausible value for any new components added to the network. Once again, this calculation is unique to each method developed.
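As a rough illustration of what directly calculating such a value could look like, consider the simplest possible case: a single sigmoid output unit connected directly to the inputs, with no hidden layer. If the new input neuron fires with value 1 for the case the expert has just reviewed, its weight can be solved for so that the output equals the expert's value exactly. The function names and the single-unit simplification are assumptions made for this sketch; the calculations used by the methods in this chapter differ.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def logit(p: float) -> float:
    """Inverse of the sigmoid, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def weight_for_new_input(
    weights: List[float], bias: float, x_old: List[float], target: float
) -> float:
    """Weight for a new input (firing at 1.0) so that the unit outputs `target`."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    net_old = bias + sum(w * xi for w, xi in zip(weights, x_old))
    # Solve sigmoid(net_old + w_new * 1.0) = target for w_new.
    return logit(target) - net_old


weights = [0.5, -0.25]
bias = 0.1
x_old = [1.0, 1.0]  # existing features that fired for the new case
target = 0.9        # value the expert is assumed to be sure of

w_new = weight_for_new_input(weights, bias, x_old, target)
net_input = bias + sum(w * xi for w, xi in zip(weights + [w_new], x_old + [1.0]))
output = sigmoid(net_input)
```

Because the new neuron is silent for every earlier case, setting its weight this way corrects the output for the new case without changing the output for any case that does not fire the new feature.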