Lesioning tests: long-term memory analysis

4.3 Analysis of model performance

4.3.2 Lesioning tests: long-term memory analysis

Before applying the lesioning tests, the first step is to analyse the weight matrices between the different layers. The weights are represented in Figure 4.7 as rectangles of variable size and colour. The size is proportional to the magnitude of the weight (the bigger the rectangle, the bigger the weight), and the colour represents the sign of the weight: red for negative, green for positive. All three learned weight matrices in the model are represented (note that the weights from the hidden to the memory layer are kept constant).

[Figure 4.7 panels: hidden units H1 to H5 against memory units M1 to M5 (top); hidden units H1 to H5 against input units T, Tx, D, P, S and Pv (middle); hidden units H1 to H5 against output units A and V (bottom).]

Figure 4.7: Neural network weight connection matrices: memory to hidden layers (top), input to hidden layers (middle) and hidden to output layers (bottom). The weights are represented as rectangles of variable size and colour: the size is proportional to the magnitude of the weight and the colour represents its sign (red for negative and green for positive).

The network contains a diverse range of weight magnitudes, with both inhibitory and excitatory connections. This suggests that the network can distinguish inputs with small differences. The same is true of their temporal structure: the features extracted from the inputs over time (the compressed representation of past states) play an important role in predicting the outputs (the standard deviation of the weights from the input to the hidden layer is 1.30, and from the memory layer to the hidden layer is 1.90). There is also a tendency for the memory units to reinforce their respective hidden units (with the exception of H4), and to inhibit the other hidden dimensions. More information can be obtained from the lesioning tests.
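The two observations above (the spread of the weights, and memory units that reinforce their own hidden unit while inhibiting the others) can be checked numerically. The following is a minimal sketch, not the procedure used in this work: the function names, the matrix layout (rows index hidden units, columns index memory units) and the use of the population standard deviation are all assumptions, and the 3x3 weights are made up for illustration.

```python
import statistics


def weight_summary(matrix):
    """Spread and sign counts for one weight matrix (a list of rows)."""
    flat = [w for row in matrix for w in row]
    return {
        "std": statistics.pstdev(flat),  # population std: an assumption
        "n_inhibitory": sum(w < 0 for w in flat),
        "n_excitatory": sum(w > 0 for w in flat),
    }


def self_reinforcing_units(memory_to_hidden):
    """Indices i where memory unit i excites hidden unit i and, on average,
    inhibits the other hidden units.

    Assumed layout: memory_to_hidden[h][m] is the weight from memory unit m
    to hidden unit h, and the matrix is square with at least two units.
    """
    n = len(memory_to_hidden)
    units = []
    for i in range(n):
        own = memory_to_hidden[i][i]
        others = [memory_to_hidden[h][i] for h in range(n) if h != i]
        if own > 0 and statistics.mean(others) < 0:
            units.append(i)
    return units


# Made-up 3x3 weights for illustration only (not the values in Figure 4.7).
toy_weights = [
    [1.5, -0.5, -1.0],
    [-0.8, 2.0, -0.2],
    [-0.4, -0.6, -0.3],
]
print(weight_summary(toy_weights))
print(self_reinforcing_units(toy_weights))
```

In this toy matrix, units 0 and 1 reinforce themselves and inhibit the others, while unit 2 does not, which mirrors the kind of exception reported for H4.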

The lesioning tests consist of systematically “damaging” selected connection weights in the model. The variation in the output error (e.g. an increase) gives an idea of how important each connection is to the model output dynamics. The lesioning procedure applied here involves lesioning each of the connections in all three learned weight matrices. In order to simplify the visualisation of the lesioning analysis⁷, Figure 4.8 shows the effect of removing each of the weights from the network on the output predictions. If the removed connection had little effect on the output performance (rms < 0.090), it is represented in black. This means a low rms error, similar to the post-training error of the unlesioned network. Higher error values are represented in grey (0.090 < rms < 0.300) and white (rms >= 0.300).
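The procedure just described (set one weight to 0.0, measure the rms error, restore the weight, repeat) can be sketched as follows. This is an illustrative sketch rather than the code used in this work: `lesion_each_weight`, `shade` and the `evaluate` callback are hypothetical names, and the toy single-layer linear model stands in for the recurrent network. The text does not say how an error of exactly 0.090 is coloured; the sketch treats it as grey.

```python
import copy
import math


def rms_error(predictions, targets):
    """Root mean square error between two equal-length sequences."""
    squared = sum((p - t) ** 2 for p, t in zip(predictions, targets))
    return math.sqrt(squared / len(targets))


def lesion_each_weight(weights, evaluate):
    """Set each weight to 0.0, one at a time, and record the resulting error.

    weights  : dict mapping a matrix name to a list of rows (lists of floats)
    evaluate : callable taking such a dict and returning the rms error
    Returns a dict of the same shape holding the error after each lesion.
    """
    errors = {}
    for name, matrix in weights.items():
        errors[name] = []
        for i, row in enumerate(matrix):
            error_row = []
            for j in range(len(row)):
                lesioned = copy.deepcopy(weights)  # keep trained weights intact
                lesioned[name][i][j] = 0.0         # "damage" one connection
                error_row.append(evaluate(lesioned))
            errors[name].append(error_row)
    return errors


def shade(rms):
    """Colour code described in the text (0.090 itself is treated as grey)."""
    if rms < 0.090:
        return "black"
    if rms < 0.300:
        return "grey"
    return "white"


# Toy stand-in for the trained model: one linear unit with two inputs.
toy_inputs = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
toy_targets = [1.0, 0.05, 1.05]
toy_weights = {"input_to_output": [[1.0, 0.05]]}


def toy_evaluate(weights):
    w = weights["input_to_output"][0]
    predictions = [w[0] * x0 + w[1] * x1 for x0, x1 in toy_inputs]
    return rms_error(predictions, toy_targets)


toy_errors = lesion_each_weight(toy_weights, toy_evaluate)
for row in toy_errors["input_to_output"]:
    print([shade(rms) for rms in row])
```

Copying the weights before each lesion keeps the tests independent of one another, so every cell reports the effect of a single removed connection.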

The lesioning tests show two groups of hidden units with stronger effects on the output: H1 (rms = 0.231), H2 (rms = 0.225) and H3 (rms = 0.181) are related to arousal, while H2 (rms = 0.111), H4 (rms = 0.125) and H5 (rms = 0.159) are related to valence. Since the hidden units are the only units directly connected to the output, these relationships are mirrored in the weight connections between these two layers.

Figure 4.8: Model weight matrices analysis: each learned weight in the model was removed (value set to 0.0, one at a time), and the model performance was then measured using the rms error. Each cell in the above matrices corresponds to the removal of one connection linking two processing units, and the value indicated in each cell corresponds to the rms error. For easier reading, the rms errors are represented using a colour code: black for those weights that had little or no effect on the model performance (rms < 0.09), grey for higher errors (0.09 < rms < 0.30) and white for the highest errors (rms >= 0.30).

⁷ The complete lesioning procedure involved 60 tests. As it would be difficult to analyse all these charts separately, the following diagram aims to integrate all the results into a single representation scheme.

All the sound features carry relevant information for the model: for every input, at least two damaged connections to the hidden layer substantially decreased the model performance (for arousal and valence). H4 is the only hidden unit which blocks the input information. For both outputs, removing each connection from the input layer to this unit had a small effect on the output (all rms values were lower than 0.063). In contrast to the hidden to output connections, which are the only ones feeding the output layer, the input to hidden connections are not the only ones affecting the hidden layer activity. The compact representation of the past states of the network is also sent to the hidden layer from the memory layer. These are the connections that “decode” the temporal structure of the inputs.

to the prediction of arousal and valence. The first main observation is that the arousal output depends more strongly on the memory layer: 17 of the 25 weights in this matrix produced high errors when removed, whereas valence had only 10 relevant connections. M4 is the only memory unit with very little influence on the hidden layer activity and the output predictions (removing each of its connections had very little effect on the model performance; the rms was always lower than 0.054). Because H4 is isolated from the inputs and its past state (M4) is discarded by the model, its activity is only affected by the past states of hidden units 1, 2 and 3 (which, as seen, are related to the arousal output). The recombination of past states in H4 nonetheless has a relevant connection to the valence output. Because the activity in H4 is related to the arousal context (H1, H2 and H3), the valence predictions have commonalities with the arousal predictions.
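Counts such as "17 of the 25 weights" and the identification of a unit whose connections are all negligible can be read off a matrix of lesioning errors. The sketch below assumes that a "relevant" connection is one whose removal gives an rms error at or above the black threshold (0.090) and that rows index hidden units while columns index memory units. Both are assumptions, the function names are hypothetical, and the error values are made up rather than taken from Figure 4.8.

```python
def count_relevant(error_matrix, threshold=0.090):
    """Number of lesions whose rms error reaches the threshold."""
    return sum(rms >= threshold for row in error_matrix for rms in row)


def negligible_sources(error_matrix, threshold=0.090):
    """Column indices whose lesions all stay below the threshold.

    Assumed layout: error_matrix[h][m] is the rms error after removing the
    connection from memory unit m to hidden unit h.
    """
    n_columns = len(error_matrix[0])
    return [
        m
        for m in range(n_columns)
        if all(row[m] < threshold for row in error_matrix)
    ]


# Made-up rms errors for a 3x3 memory-to-hidden matrix (illustration only).
toy_errors = [
    [0.21, 0.05, 0.12],
    [0.10, 0.04, 0.31],
    [0.07, 0.03, 0.15],
]
print(count_relevant(toy_errors))
print(negligible_sources(toy_errors))
```

Applied separately to the arousal and valence error matrices, the first function would give the two counts compared in the text, and the second would flag a unit that behaves like M4.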

4.3.3 Input/output transformation: model production rules

In document Computational and Psycho-Physiological Investigations of Musical Emotions (Page 102-106)