Statistics of the Response Distribution - Explicit object representation by sparse neural codes

In this section I discuss the statistics of the responses obtained from the sparse coding network after training. As an example I use the responses from the face discrimination network discussed in Section 5.2.4 (which was trained on 10 images of each of 10 different individuals); these results are typical of those obtained from other runs. For comparison I will look at two cases that strip key features from the sparse coding network. First, I cut the feedback connections but leave the dynamics of the individual neurons intact, so that the network dynamics become

$$\dot{v} = G^T u + \lambda S'(v). \tag{5.2}$$

This will illuminate the role feedback plays in recognition performance and sparsening of responses, and provide a prediction of how recognition would suffer in the event that feedback connections were cut in the real biological system. Second, I simply treat the trained G matrix as a feedforward linear filter, that is, I set $v = G^T u$. This shows how similar each input u is to each learned basis function in the absence of the feedback inhibition that produces winner-take-all-like behavior in the network. This linear feedforward network will allow us to see how much of the sparseness of the responses is due to the form of the learned basis functions and how much is due to the sparsening nature of the dynamics.
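As a minimal sketch of the second comparison case, the linear feedforward response is a single matrix product. The dimensions, the random stand-in for G, and the unit-norm columns below are assumptions made for illustration only; they are not taken from the trained network.

```python
import numpy as np

def linear_feedforward(G, U):
    """Return V = G^T U, one column of responses per input column of U."""
    return G.T @ U

# Hypothetical sizes: 64-dimensional inputs, 15 basis functions, 100 inputs.
rng = np.random.default_rng(0)
G = rng.standard_normal((64, 15))
G /= np.linalg.norm(G, axis=0)       # assumption: unit-norm basis functions
U = rng.standard_normal((64, 100))   # synthetic stand-in for the test images

V = linear_feedforward(G, U)         # shape (15, 100): neurons x images
```

Each entry of V is the inner product of one input with one basis function, so with unit-norm columns an input equal to a basis function drives the matching unit to a response of 1.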

Both the dynamic and linear feedforward networks still perform well according to our classification metrics, with an average ROC accuracy of 89% in both cases (compared to 91% for the feedback network). However, a more detailed look at the responses reveals that true recognition performance would likely suffer somewhat more than the optimal ROC result suggests. The purely linear feedforward model lacks the bimodal response distribution that cleanly separates “on” responses from “off” responses and makes readout particularly easy. The response distribution of the dynamic feedforward network is still bimodal, but while the largest responses of an individual neuron tend to be to its preferred person, many significant responses are to other people due to the lack of inhibitory feedback from other neurons in the network. Hence our model predicts that, if feedback connections in the visual pathway were somehow cut, recognition performance would suffer but not be eliminated entirely; instead we would expect increased confusion between similar people or objects. Feedback is crucial for learning, however: we would expect a person with such an injury to be unable to learn to recognize new people or categories.

Figure 5.13(a) is a histogram of the strength of all responses to all images in the testing data set (100 images times 15 neurons for 1500 total responses). The response distribution is bimodal, as specified by the sparse prior, with most responses near zero. The “large” responses are centered around roughly 1.25, somewhat larger than the second peak location of 1 in the prior, as the inputs bias all responses to be larger than the unstimulated equilibrium points of 0 and 1. The kurtosis excess of this distribution is 8.7, reflecting its sparse and bimodal nature. The responses of the dynamic feedforward network, depicted in Figure 5.13(b), are still bimodal, and are in general larger due to the lack of inhibitory feedback. These responses are still sparse, with a kurtosis excess of 6.6. Figure 5.13(c) is the same histogram for the linear feedforward network. Its responses are unimodal and widely varied but, owing to the nature of the sparse basis functions, still sparse (though less so), with a kurtosis excess of 3.5.
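The kurtosis excess quoted above can be computed from the pooled responses as the fourth standardized moment minus 3, so that a Gaussian scores 0. The sketch below uses the population (biased) moment estimator, which is an assumption; the text does not say which estimator was used, and the helper name `excess_kurtosis` is hypothetical.

```python
import numpy as np

def excess_kurtosis(x):
    """Fourth standardized moment minus 3 (population estimator)."""
    x = np.asarray(x, dtype=float).ravel()
    d = x - x.mean()
    m2 = np.mean(d ** 2)   # second central moment (variance)
    m4 = np.mean(d ** 4)   # fourth central moment
    return m4 / m2 ** 2 - 3.0

# Toy data, not the thesis responses: 10% of units "on", 90% "off".
sparse_toy = np.array([1.0] * 10 + [0.0] * 90)
# Symmetric two-point data: bimodal but not sparse.
balanced_toy = np.array([-1.0, 1.0] * 50)

k_sparse = excess_kurtosis(sparse_toy)
k_balanced = excess_kurtosis(balanced_toy)
```

On these toy inputs the sparse sample has a positive kurtosis excess while the balanced two-point sample has a negative one, which illustrates that a large positive value is driven by most of the mass sitting near zero with comparatively rare large responses.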

One often-suggested role for sparseness is the reduction of redundancy by decorrelating neuronal responses (Vinje & Gallant, 2000). Figure 5.14(a) is a histogram of the correlation coefficient between all neuron pairs (15 choose 2, or 105 pairs). Most correlation coefficients are negative, reflecting the inhibitory effect neurons have on one another. Overall correlations are weak, with a mean absolute value of the correlation [...]

Figure 5.13: Histogram of the strength of all responses to all images in the testing data set (1500 responses total); horizontal axis: response v_i, vertical axis: fraction of responses. (a): feedback network depicted in Figure 5.3. (b): the same network with the feedback connections cut. (c): linear feedforward network.

Figure 5.14: Histogram of the correlation coefficient between all neuron pairs (105 neuron pairs total); horizontal axis: correlation coefficient, vertical axis: fraction of unit pairs. (a): feedback network depicted in Figure 5.3. (b): the same network with the feedback connections cut. (c): linear feedforward network with the same G matrix.

the feedback connections cut; correlations in this setting are somewhat higher, with a mean absolute value of 0.28. Finally, Figure 5.14(c) is the same histogram for the linear feedforward network. Neuronal responses are more strongly correlated in this case, with a mean absolute value of the correlation coefficient of 0.37. From this we see that both the dynamics induced by the sparse prior and the recurrent feedback play a role in decorrelating neural responses. Note that we are not considering temporal correlations here (as our network considers each image separately rather than an image sequence), but correlation in the amplitude of neural responses.
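The pairwise statistic discussed here can be sketched as follows: given a response matrix with one row per neuron and one column per image, compute the correlation coefficient for every unordered neuron pair and average the absolute values. The function name and the random stand-in data are hypothetical, and the use of Pearson correlation is an assumption about what "correlation coefficient" means in the text.

```python
import numpy as np

def pairwise_correlations(V):
    """V has shape (n_neurons, n_images).

    Returns the correlation coefficient of every unordered neuron pair
    and the mean of their absolute values.
    """
    C = np.corrcoef(V)                        # rows are treated as variables
    upper = np.triu_indices(V.shape[0], k=1)  # each pair once, no diagonal
    coeffs = C[upper]
    return coeffs, float(np.mean(np.abs(coeffs)))

# Synthetic stand-in: 15 neurons, 100 test images (not the thesis data).
rng = np.random.default_rng(0)
V = rng.standard_normal((15, 100))
coeffs, mean_abs = pairwise_correlations(V)   # coeffs has 15*14/2 = 105 entries
```

These are correlations of response amplitude across images, matching the note above that temporal correlations are not considered.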

[...] network serves both to enhance the sparseness of the responses (through the sparse prior distribution encoded in each neuron’s dynamics) and to reduce the correlation between the responses of different neurons. This decorrelation reduces the redundancy of information carried in the firing rates of different neurons.
