5.2 A Feedforward Model of Visual Processing
5.2.5 Results: Morphed Faces
A further investigation of the human MTL responses, currently in progress, involves presenting the patient with “morphed” images of familiar people, that is, images created by blending images of two different people (A. Kraskov, personal communication). This experiment serves two purposes: to investigate how the response of a neuron changes as an image is continuously transformed from a person the neuron responds strongly to into some other person (and back), and to see how the neuron’s activity correlates with the subject’s perception of the image’s identity. The first question can also be investigated within this computational framework, with the additional advantage that we can look at the responses of neurons that ordinarily represent both ends of the morph (while in the human studies the investigators generally only have access to a neuron representing one of the endpoints, due to the small number of simultaneously recorded selective neurons in any one session).
[Figure 5.7, panels (a) to (d): top-response images with activation levels; response histograms (x-axis: activation level v_i; y-axis: number of responses) with ROC insets, area = 0.950 in (b) and 0.943 in (d).]
Figure 5.7: Responses of two selective units (out of 15) after the unsupervised category learning. (a, c): images that evoked the top responses, with the activation level above each image. Every 2nd image omitted for clarity. (b, d): response histograms. The x-axis is the activation level; the y-axis is the number of test images (40 total) evoking a response at that level. Responses to the preferred person (Halle Berry in (b), Jennifer Aniston in (d)) are in black; responses to all other images are in white. Insets: ROC curves. The solid line is the ROC curve for the selected unit; the dashed line is the ROC curve for the best principal component.
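The ROC areas reported for Figure 5.7 can be illustrated with a minimal sketch. It assumes the area is computed in the standard rank-based way, as the probability that a randomly chosen preferred-person image evokes a stronger response than a randomly chosen other image; the text does not state how the areas were actually computed. The function name `roc_auc` and the example activations are hypothetical.

```python
def roc_auc(preferred, others):
    """Area under the ROC curve via pairwise comparison.

    preferred: activations evoked by images of the unit's preferred person.
    others:    activations evoked by all other test images.
    Ties count as half a win (standard Mann-Whitney convention).
    """
    wins = 0.0
    for p in preferred:
        for o in others:
            if p > o:
                wins += 1.0
            elif p == o:
                wins += 0.5
    return wins / (len(preferred) * len(others))


# Hypothetical activations, not taken from the figure.
in_category = [1.6, 1.3, 1.2]
out_category = [0.3, 0.2, 0.0]
area = roc_auc(in_category, out_category)  # perfectly separated responses
```

An area of 1 corresponds to in-category and out-category responses that do not overlap at all, and an area of 0.5 to a unit that cannot tell the two apart.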
To explore how the model responds to morphed images, I picked two individuals with some similarities in appearance from the same training session. For this example I used the same trained network whose results are presented in Section 5.2.4 above, for which there were 10 different individuals in the input set. Both individuals used for morphing were very well represented by the trained network: each had a unit providing 100% ROC accuracy with well-separated in-category and out-category responses. I generated 9 morphed images between each of 5 different pairs of images using the commercially available photo morphing software “Morpheus” (available at http://www.morpheussoftware.net/). To ensure a smooth morph between the two images, I manually matched keypoints such as eyes, ears, and mouths in the two images, so the resulting morph was a combination of distortion and grey-level interpolation between the starting and ending images. I then computed the response of the trained network to the morphed images. There is no hysteresis effect in these results, as the state of the network (initial condition of v) was reset for each image presentation.
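The grey-level interpolation half of the morphing procedure can be sketched as a simple cross-dissolve. This is only an illustration: it omits the keypoint-driven distortion that the morphing software performs, and it assumes the 9 intermediate images are evenly spaced at 10% to 90% morphed, which matches the axis of Figure 5.8 but is not stated explicitly. The names `cross_dissolve` and `morph_sequence` are hypothetical.

```python
def cross_dissolve(img_a, img_b, alpha):
    """Pixel-wise blend of two equally sized grey-level images.

    Images are lists of rows of grey levels. alpha = 0 returns img_a,
    alpha = 1 returns img_b. No geometric warping is applied.
    """
    return [
        [(1.0 - alpha) * a + alpha * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(img_a, img_b)
    ]


def morph_sequence(img_a, img_b, n_steps=9):
    """Return n_steps intermediate images, evenly spaced between the endpoints."""
    return [
        cross_dissolve(img_a, img_b, k / (n_steps + 1))
        for k in range(1, n_steps + 1)
    ]


# Tiny hypothetical 1x2 "images" standing in for two aligned face photographs.
start = [[0.0, 100.0]]
end = [[100.0, 0.0]]
sequence = morph_sequence(start, end)  # 9 images at 10%, 20%, ..., 90% morphed
```

Without the distortion step, features that are not aligned in the two photographs appear as ghosted double images in the intermediate frames, which is why the keypoints were matched by hand.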
Figure 5.8 summarizes the results for all 5 morphings and gives an example morphing. Response strength to each morphed image is shown for the two neurons representing the two individuals. Each curve is the response of one neuron to one set of morphed images; the curves are individually normalized by the strength of the neuron’s response to the unmorphed image of its preferred person. As expected, the response curves are sigmoidal, with a sharp transition between on and off responses as some threshold is crossed. This sigmoidal transition is a feature of the sparse coding network and is due to a combination of the bimodal prior and the winner-take-all-like network topology; it differs markedly from the gradual transition that would be expected from linear filters. In a distributed population code in which individual neurons responded to, for example, different types of facial features, individual neurons might still switch on or off in the same sigmoidal fashion as their preferred features became more or less clear, though just as plausibly their activity could vary smoothly if they functioned as linear feature templates.
[Figure 5.8 plot: x-axis: percent morphed (0 to 100); y-axis: response (0 to 1).]
Figure 5.8: Responses of trained network to 5 different morphings between the same two individuals (top) and an example morphing (bottom). Solid lines are the responses of the neuron that prefers the person on the left; dashed lines are the responses of the neuron that prefers the person on the right. All responses are normalized by the response to the unmorphed preferred image.
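The gradual transition expected from linear filters can be made concrete with a small sketch. It assumes a unit whose response is the inner product of a fixed template with the image, and a morph that is a pure grey-level blend (the geometric distortion in the actual morphs is ignored). Under those assumptions the response is exactly linear in the morph fraction, with no threshold. The template and image vectors are hypothetical and are not taken from the trained network.

```python
def dot(w, x):
    """Inner product of a template and an image, both given as flat vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


def blend(x_a, x_b, alpha):
    """Grey-level interpolation between two flattened images."""
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(x_a, x_b)]


def linear_curve(template, x_a, x_b, n_steps=9):
    """Response of a linear filter along the morph from x_a to x_b.

    Normalized by the response to the unmorphed image x_a, mirroring the
    normalization used for the curves in Figure 5.8.
    """
    reference = dot(template, x_a)
    fractions = [k / (n_steps + 1) for k in range(n_steps + 2)]
    return [dot(template, blend(x_a, x_b, f)) / reference for f in fractions]


# Hypothetical template and images (assumptions for illustration only).
template = [1.0, 0.5, 0.0, 0.0]
person_a = [1.0, 1.0, 0.0, 0.2]
person_b = [0.0, 0.2, 1.0, 1.0]
curve = linear_curve(template, person_a, person_b)
```

Every step along `curve` changes the response by the same amount, in contrast to the sharp on/off transition of the sparse coding units.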
Much as in the human data from both electrophysiology and psychophysics, different morphings result in different transition thresholds, reflecting the difference between the qualitative similarity of a morphed image to one individual or the other and the distance along the continuum of morphings (Kraskov, personal communication). The average point at which the response was 50% of the response to the unmorphed preferred image was 18.6% morphed (σ = 4.6) for the neuron that [...] preferred the individual on the right. Hence in most cases there is a range of morphings that produce only weak responses in both neurons: the network essentially decides that the image resembles neither individual. Because (as noted above) neurons representing both endpoints are only rarely available in the human experiments, it is very difficult at this time to compare this particular aspect of the responses to real data. It does, however, suggest an interesting question that could be asked if it ever becomes possible to perform the morphing experiment between images of two people that are represented by two different recorded neurons: is one or the other of two such neurons always active, or, as in the model, is there some range of morphed images that elicits no strong response? Further, how does this activity correlate with the subject’s identification of the image as being one person or the other (or neither)?
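The 50% transition point described above can be estimated from a sampled response curve as sketched below. The sketch assumes the curve is sampled at evenly spaced morph levels, ordered starting from the unit's preferred unmorphed image, and that the crossing is located by linear interpolation between neighbouring samples; the text does not say how the crossing was actually determined. The function names and the response values are hypothetical.

```python
def normalize(responses):
    """Divide a curve by its first entry (the unmorphed preferred image)."""
    reference = responses[0]
    return [r / reference for r in responses]


def half_max_crossing(percent, responses, level=0.5):
    """Morph percentage at which the normalized response first falls below level.

    percent:   morph levels, starting at 0 for the unit's preferred image.
    responses: raw responses of the unit at those levels.
    Returns None if the curve never drops below the level.
    """
    norm = normalize(responses)
    for i in range(1, len(norm)):
        before, after = norm[i - 1], norm[i]
        if before >= level > after:
            # Linear interpolation between the two bracketing samples.
            frac = (before - level) / (before - after)
            return percent[i - 1] + frac * (percent[i] - percent[i - 1])
    return None


# Hypothetical curve for one unit: strong up to 10% morphed, silent from 20%.
percent = list(range(0, 101, 10))
responses = [2.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
threshold = half_max_crossing(percent, responses)
```

For the unit that prefers the other endpoint, the same function applies once its curve is reversed so that index 0 is its own preferred unmorphed image. Averaging the thresholds over the 5 morphings then gives summary statistics of the kind quoted in the text.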