Signalling changes - Basic Vision - An introduction to visual perception

Micrograph of a slice through the monkey retina, showing the fovea. Light entering the eye would arrive from the top of the picture. The long thin photoreceptors (the rods and cones) can be seen in the light-coloured layer near the bottom. Three distinct bands of cell bodies can be seen. The dark layer just above the cones, the outer nuclear layer, contains the cell bodies of the cones.

The next dark layer, the inner nuclear layer, contains the cell bodies of the horizontal, bipolar, and amacrine cells. The top dark layer contains the ganglion cells. Note that in the central fovea the cones are close to the surface of the retina as the layers of cells above them are pulled aside to form the foveal pit.

Figure labels, from top to bottom: retinal ganglion cells, inner nuclear layer, outer nuclear layer, photoreceptors.

CHAPTER OVERVIEW

From the brightest of sunny days to the dimmest of starless nights, our visual systems need to tell us what is out there if we are to find our food and avoid being eaten. How does the visual system cope with this massive range of conditions? How can it turn the billions of photons of light that enter the eye into an efficient code that can tell us quickly and accurately about the world around us? In this chapter we shall see that our early visual system actually throws away lots of information about the scene, leaving us only with information about the changes, or edges, that occur in the image. It does this by adopting a strategy with which students will be very familiar: it tries to be as lazy as possible until it really has to act. One consequence of this is that anything that remains the same over time can disappear completely from our perception. However, what may at first sight seem like a perverse and profligate strategy actually creates a code that is efficient and easy to use.

Introduction

As we learnt in the previous chapter, the retina contains an array of photoreceptor cells called cones and rods that transform the light energy into electrical activity. This electrical activity then passes through a network of cells until it reaches the retinal ganglion cells (see Figure 2.1 to remind yourself what the retina looks like). The ganglion cells in turn send their signals out along the optic nerve for processing in other parts of the brain. We shall look at how these retinal ganglion cells group together the output from many photoreceptor cells into receptive fields, and how such an operation allows the most important information to be transmitted onward to the brain. The key to understanding this aspect of retinal function lies in working out exactly why these receptive fields are good at transmitting useful information. It turns out that the transmitted information tells the brain about changes in the pattern of information reaching the eye. These changes are either in space, such as a border between a bright region and a dark one, or in time, such as a sudden increase or decrease in light intensity.

Figure 2.1 A simple diagrammatic representation of the retina. Labelled cells: rod, cone.

A problem

Let’s think about what we are trying to do with our visual system. In Chapter 0, we introduced the idea that vision is an active system: its job is to find the important information (ripe fruit, cars on a collision course with us, angry faces, ...) and to disregard the things we do not need. We also said that all we have to do this with is the huge number of photons (minute balls of light) that happen to enter our pupils. So what problems face us in this quest, and how can we (and many other animals) solve them?

There is a lot of information in a retinal image (see Box 2.1). The simplest way to transmit this huge data load to the brain might be to connect each photoreceptor in the retina (all 130 million of them) to its ‘own’ ganglion cell (remember, it is the axons of the ganglion cells that form the optic nerve) and to transmit all this information to the visual cortex. It turns out that this would require far too many neurons, and the cable formed by these 130 million nerve fibres would take up nearly all the eye, leaving us with a massive blind spot (or a small sight spot, if you like). This clearly defeats the point of having an eye in the first place. The purpose of the visual system is not to get a little picture of the world into the brain; there is a perfectly good world out there, so we don’t need a picture of it in our heads. What we need to do is somehow to get rid of things that are not important and only signal the things that are important, but this just changes the problem: now we shall have to decide what information is important.

Box 2.1 How much information is in a scene?

How much information is there in the average scene? We can get a very rough answer by drawing an analogy with the way computers store images. We have about 120 million rods and 8 million cones in each eye; that’s around 130 million photoreceptor cells (give or take a few million). Each point can represent many different shades of light (or shades of grey) by how much it fires (forget colour for now). How many? If you have 200–300 shades of grey then the picture looks much the same as one which has millions of shades of grey, so 200–300 shades is probably all the visual system needs. It happens that 256 shades of grey can be encoded in 8 bits of information (that’s 2⁸ for the mathematically minded). Now, 8 bits make up 1 byte. So, a picture with 130 million points, where each point has 1 byte, has 130 million bytes of information. One million bytes of information is very nearly 1 megabyte (MB). Now, let’s suppose that you can see flicker up to 30 flashes per second. Therefore, it might follow that the eye encodes roughly 30 images per second. This gives 130 × 30 = 3900 MB per second per eye. That’s about six compact discs full every second! On this completely simplistic calculation, the information flow from two eyes would fill up a 120 GB hard drive on a PC in around 15 seconds.

OK, a serious problem, but perhaps not quite that serious. We also know that the information leaving the eye does so in the 1 million nerve fibres of the optic nerve. So, a good deal of image compression, of the sort described in this chapter, will have occurred before the information leaves the retina. However, even with only 1 million ganglion cells (instead of 130 million receptors) to contend with, you would still have filled your hard disk in 30 minutes. Of course, putting in colour information as well can only make matters worse: there’s still more information!

Even allowing for the dodgy assumptions in the above calculation, the potential amount of information in visual scenes is mind-boggling, both for a computer and for a brain. Ways must be found to reduce the amount of stored information in such a way that the important stuff is still there but the irrelevant stuff is not. Getting that balance right is what makes the visual systems of biological creatures so very clever. Computer and camera technology often aims to mimic the action of human vision. A digital picture is usually stored in compressed format; for example, a JPG image from a digital camera uses much less memory than an uncompressed TIF image. The compression method tries to reduce information in such a way that the discarded information is not visible to a human observer, unless you examine the picture up close with a magnifying glass (Figure 2.1.1). Therefore, the compression algorithms need to know about how human vision works. The same kind of thing is true (except that the problems become even harder to solve) for moving images, such as digital TV transmission.

Figure 2.1.1 These two pictures seem identical. The picture on the left is a ‘jpeg’ image and takes up 240 kB of memory, while the ‘tiff’ uncompressed image on the right occupies a massive 14 MB. Clearly, the information that the jpeg throws away is information that our visual system doesn’t need. The lower image shows a close-up of the images, and now we can see the distortions introduced by the jpeg compression.
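The back-of-the-envelope sum in Box 2.1 is easy to check for yourself. The short Python sketch below simply repeats the box's own simplifying assumptions (1 byte per point, 30 images per second, two eyes). The variable and function names are made up for illustration, and treating 1 GB as 10⁹ bytes is an extra assumption of ours, not something the box specifies.

```python
# Rough data-rate estimate, repeating the simplifying assumptions of Box 2.1.
PHOTORECEPTORS_PER_EYE = 130_000_000  # ~120 million rods + 8 million cones, rounded as in the box
OPTIC_NERVE_FIBRES = 1_000_000        # ganglion cell axons per eye
BYTES_PER_POINT = 1                   # 256 grey levels = 2**8, i.e. 8 bits = 1 byte
IMAGES_PER_SECOND = 30                # flicker limit assumed in the box
EYES = 2
DISK_BYTES = 120 * 10**9              # the 120 GB drive, taking 1 GB = 10**9 bytes (our assumption)


def bytes_per_second(points_per_eye, eyes=EYES):
    """Data rate if every point sent one byte for every image."""
    return points_per_eye * BYTES_PER_POINT * IMAGES_PER_SECOND * eyes


def seconds_to_fill(disk_bytes, rate):
    """How long a disk of the given size lasts at the given data rate."""
    return disk_bytes / rate


receptor_rate = bytes_per_second(PHOTORECEPTORS_PER_EYE)
ganglion_rate = bytes_per_second(OPTIC_NERVE_FIBRES)

print(f"All photoreceptors: {receptor_rate / 1e6:.0f} MB/s, "
      f"disk full in {seconds_to_fill(DISK_BYTES, receptor_rate):.1f} s")
print(f"Ganglion cells only: {ganglion_rate / 1e6:.0f} MB/s, "
      f"disk full in {seconds_to_fill(DISK_BYTES, ganglion_rate) / 60:.1f} min")
```

On these assumptions the photoreceptor version fills the disk in roughly 15 seconds and the ganglion-cell version in a little over half an hour, which matches the figures quoted in the box.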

Retinal ganglion cells and receptive fields

How do we know what information is important to the visual system? The simplest way is to ask the ganglion cells. By performing physiological measurements on them, as they are the final stage of retinal processing, we can find out what they actually do.

This is done by putting an electrode near one ganglion cell in the retina of an animal and waving a small spot of light around in front of the eye. (Don’t feel tempted to do this at home, even if you have a small torch and a pet.) The electrode will pick up the small electrical signals associated with each action potential, and if you amplify this signal and send it to your hi-fi you can listen to the activity (each action potential gives us a little click), or if you send the signal to a TV you can count the number of action potentials (see Chapter 12 for more on single-cell recording). Some cells fire over 100 times a second, so you either have to be a very fast counter or get a computer to help.

The first thing we find as we listen to the cell is that even when there is no spot of light from our torch, we will hear a (fairly low) number of clicks per second; that’s the neuron’s baseline activity level. It turns out that these cells (and many others in the brain) fire off action potentials spontaneously, even when they are not directly stimulated. You now turn on the spot of light from your torch, and lo and behold nothing happens: you still hear infrequent clicks.

Sometimes, if you are very lucky and the spot of light just happens to be shining on a very particular small part of the retina, you will hear the clicks (action potentials) either increase or decrease in frequency. You may now be tempted to explore just those parts of the visual world in which you can make the cell either increase or decrease its firing. What you will find is that there is only a tiny region of the field where you can do this; this region is what we call the cell’s receptive field (Lennie, 2003). This explains why we had to be very lucky to get the cell to do anything interesting. The cell only cares about changes in one little bit of the world and ignores the rest. The idea of a receptive field is so important in understanding perception that it is worth defining it properly. The receptive field of a cell is the area on the retina over which the behaviour of that cell can be directly influenced. Remember that, because it’s probably the most important thing you’ll learn today.

Let’s now go back and look more closely at our receptive field. If you hold the light in enough locations and listen to the clicks from the neuron, you find that the receptive field contains two kinds of region: one in which the light increases the click rate (we call this an ON region), and another in which the light decreases the click rate (an OFF region). This seems a bit weird. What’s the point of having a system in which a light both increases and decreases the firing rate of a neuron? Wouldn’t it be simpler if the neuron just fired more when the light was in its receptive field? Yes, but not as useful, as we shall soon see. First, though, let’s look a bit more at the way these receptive fields are organized.
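The mapping procedure described above (move the spot, listen, compare with the baseline) can be written down as a few lines of Python. This is only a sketch: `toy_cell_rate` is a made-up, noise-free cell in one dimension, and its baseline, region sizes and firing rates are illustrative assumptions rather than measurements.

```python
BASELINE = 12.0  # hypothetical spontaneous rate, in impulses per second


def toy_cell_rate(spot_position):
    """Made-up cell: firing rate when a small spot sits at spot_position."""
    distance = abs(spot_position)
    if distance <= 1.0:          # spot on the inner region: firing goes up
        return BASELINE + 40.0
    if distance <= 3.0:          # spot on the outer region: firing goes down
        return max(0.0, BASELINE - 8.0)
    return BASELINE              # anywhere else: no effect at all


def map_receptive_field(rate_fn, positions, baseline):
    """Label each probed position by comparing the rate with the baseline."""
    regions = {}
    for position in positions:
        rate = rate_fn(position)
        if rate > baseline:
            regions[position] = "ON"
        elif rate < baseline:
            regions[position] = "OFF"
        else:
            regions[position] = "outside"
    return regions


field_map = map_receptive_field(toy_cell_rate, range(-5, 6), BASELINE)
# The receptive field is every position where the spot changed the firing.
receptive_field = [p for p, label in field_map.items() if label != "outside"]

for position, label in field_map.items():
    print(f"spot at {position:+d}: {label}")
```

A real neuron is noisy, so an actual experiment would need repeated measurements and some statistical criterion for "different from baseline"; the exact comparison used here only works because the toy cell is perfectly repeatable.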

In 1953, Kuffler performed an experiment, similar to our example above, on the ganglion cells of a cat (Kuffler, 1953). The results of his experiment, and many similar ones, suggested that the ganglion cells have receptive fields similar to those shown in Figure 2.2. The typical receptive field is roughly circular, with a small central region and a larger surround, rather like a fried egg. There are approximately equal numbers of two types of this receptive field, called ‘ON-centre’ and ‘OFF-centre’. An ON-centre unit is one whose firing rate increases when light hits its centre, and decreases when light hits the surround. An OFF-centre unit does this the other way round. For all receptive fields, stimulation with light somewhere else in the visual field (outside the outer boundary) has no effect on the firing rate. If we were to take a cross-section through these receptive fields and plot how much each cell would fire as we change the position of our little spot of light (as we have at the bottom of Figure 2.2), we see that the ON-centre cell has what is called a ‘Mexican hat’ shape, because if you can imagine this in 3-D you will indeed have something approaching the shape of a Mexican hat (minus the tassels and embroidery). Light falling on the very centre increases the cell’s firing the most (the technical term is maximum excitation), and light falling on the centre of the surround region causes the largest decrease in firing (maximum inhibition). The profile for the OFF-centre cell is that of an upside-down Mexican hat, or whatever it is that looks like an upside-down Mexican hat.
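If you would like to see a Mexican hat profile appear from numbers, one common way to model it is as a narrow positive Gaussian (the centre) minus a broad Gaussian (the surround). The chapter itself does not commit to this form, so treat the difference-of-Gaussians model, the function names, and the two widths below as assumptions chosen purely for illustration.

```python
import math


def gaussian(x, sigma):
    """Unit-area Gaussian bump centred on zero."""
    return math.exp(-x**2 / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))


def on_centre_profile(x, sigma_centre=0.5, sigma_surround=1.5):
    """Narrow excitatory centre minus broad inhibitory surround (assumed widths)."""
    return gaussian(x, sigma_centre) - gaussian(x, sigma_surround)


def off_centre_profile(x, sigma_centre=0.5, sigma_surround=1.5):
    """The OFF-centre cell is modelled as the same hat turned upside down."""
    return -on_centre_profile(x, sigma_centre, sigma_surround)


# Cross-section through the receptive field, as at the bottom of Figure 2.2.
positions = [i * 0.25 for i in range(-20, 21)]
profile = [on_centre_profile(x) for x in positions]

peak = positions[profile.index(max(profile))]      # maximum excitation
trough = positions[profile.index(min(profile))]    # maximum inhibition (one side)

# Crude text plot: '+' for excitation, '-' for inhibition.
for x, value in zip(positions, profile):
    bar = ("+" if value > 0 else "-") * int(round(abs(value) * 40))
    print(f"{x:+5.2f} {bar}")
```

In this sketch the largest positive value sits at the very centre, the most negative value sits out in the surround, and the profile fades to almost nothing beyond the outer boundary, which is the qualitative pattern the text describes.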

So, the physiology tells us that retinal ganglion cells:

• have small receptive fields

• have receptive fields that have a centre and a surround that are in competition with each other

• come in two varieties, ON-centre and OFF-centre (see Box 2.2).

But this doesn’t tell us why we have these. To understand this, we need to consider how such cells respond to some simple patterns.

Figure 2.2 Representations of ON- and OFF-centre receptive fields. The ‘fried eggs’ in the upper part of the figure show the maps of areas of excitation and inhibition in an ON-centre cell (left) and an OFF-centre cell (right). The traces below show the response of each cell to a spot of light presented at various positions across the centre of each cell. In 3-D, we can think of the receptive field as looking like a Mexican hat, pictured here for anyone who hasn’t seen one before.

Receptive fields and image processing

Look at the ON-centre cell illustrated in Figure 2.3. When no light is shone on it the cell fires a few times per second. Now, we turn on a bright light that just illuminates the centre and the firing rate jumps up to many impulses per second (you can count these in Figure 2.3b if you’ve nothing better to do). We can say that the centre has an excitatory input. If we illuminate only the surround, the firing rate drops, abolishing even the baseline activity (Figure 2.3c). We can say that the surround has an inhibitory input; in other words, stimulating the centre increases firing and stimulating the surround decreases firing. So what happens if we stimulate both centre and surround simultaneously with a large spot of light? Now, both the excitatory input from the centre and the inhibitory input from the surround will be present. Together, these inputs tend to cancel out, and there is little or no change in the response of the cell (Figure 2.3d).

So, the overall result is that the cell doesn’t change its response to large things (‘large’ here is defined as anything bigger than the cell’s receptive field). In order to get a response from this cell, we need to have a change in the light occurring within the receptive field.

Figure 2.3 The response of an ON-centre ganglion cell to various stimuli (panels a to d show each stimulus and the cell’s response to it). Spots of light larger than the receptive field have no effect. The best response comes from a small spot that just covers the excitatory centre.

Box 2.2 ON-centre and OFF-centre cells

It’s not obvious why we should have both ON- and OFF-centre cells. At the level of the photoreceptors we don’t have separate receptors for increases and decreases in light level; each cell signals by increasing or decreasing its signal (Dolan and Schiller, 1989). The reason why we have separate ON- and OFF-centre ganglion cells seems to be that the retinal ganglion cells have only a low baseline rate of firing when there is no stimulus. Now, if the cell is not firing much to begin with, it’s very easy to spot when the cell begins to fire a lot; however, it’s very hard to spot when the cell fires even less. Therefore, we would find it very difficult to see changes that went from light to dark with such a system. To be sensitive to such a stimulus, and to perceive it rapidly, we need a system that increases its firing to such a luminance decrement (as well as one for increments of course).

How can we show this? It turns out that the two pathways (ON and OFF) use different chemicals (neurotransmitters). Moreover, a particular chemical, 2-amino-4-phosphonobutyrate (APB), can selectively block the ON-bipolar cells so that they can no longer fire (remember that the bipolar cells lie between the photoreceptors and the ganglion cells (Figure 2.1)). We can now see what happens to vision if all the ON-ganglion cells’ responses to the stimuli are blocked, while all of the OFF-cells’ responses are unaffected. This is more interesting than it might initially seem. What we might have predicted is that the response from the centre of the ON-cell would be blocked, but the surround would be OK, whereas for the OFF-cell the centre might have been OK, but the surround might be lost. Not so. At the behavioural level we find that the animal is no longer able to detect targets such as the light spots seen in Figure 2.2.1, but can still spot the dark ones. So we need both systems to see the world as we normally do.

The above story holds true for daytime conditions where we are using our cones. However, under night-time conditions things change. If we administer APB (which, remember, blocks the ON-cells only) the animal becomes completely blind (temporarily, you will be pleased to hear). This tells us that only the ON-cells are normally active under rod vision. Therefore, if you were to sit in a dark room for about 30 minutes (see Box 2.4) while looking at Figure 2.2.1, we can make two predictions: (1) you won’t be able to see the dark blobs; (2) you spend too much time on your own.

Figure 2.2.1 Light and dark spots. After sitting in the dark for half an hour you won’t be able to see the dark spots.
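The four situations of Figure 2.3 can be captured in a toy model of an ON-centre cell: add the centre's excitation, subtract the surround's inhibition, and remember that a firing rate cannot fall below zero. Everything numerical here is an assumption made for illustration (the baseline of 12 impulses per second is simply borrowed from the example rate used in the text, and the equal gains are chosen so that a large spot cancels exactly); it is a sketch, not a model of any particular cell.

```python
# Toy ON-centre cell. All values are illustrative assumptions.
BASELINE = 12.0       # spontaneous firing, impulses per second
CENTRE_GAIN = 60.0    # impulses/s added when the whole centre is lit
SURROUND_GAIN = 60.0  # impulses/s removed when the whole surround is lit


def firing_rate(centre_lit, surround_lit):
    """Firing rate given the fraction (0 to 1) of each region that is lit."""
    drive = CENTRE_GAIN * centre_lit - SURROUND_GAIN * surround_lit
    return max(0.0, BASELINE + drive)  # a cell cannot fire fewer than 0 times


stimuli = {
    "(a) no light": (0.0, 0.0),
    "(b) centre only": (1.0, 0.0),
    "(c) surround only": (0.0, 1.0),
    "(d) large spot": (1.0, 1.0),
}

for label, (centre_lit, surround_lit) in stimuli.items():
    rate = firing_rate(centre_lit, surround_lit)
    print(f"{label:18s} {rate:5.1f} impulses per second")
```

With these numbers the small central spot drives the cell well above baseline, the surround-only stimulus silences it, and the large spot leaves it at baseline, as in panels (b) to (d). Notice also how little room there is below the baseline compared with above it: the cell can raise its rate by 60 but lower it by only 12. That asymmetry is the reason given in Box 2.2 for having a separate OFF-centre system to signal decrements.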

Now let’s take the story one step further. In Figure 2.4a we see an edge falling across an ON-centre cell. How will it respond? Let’s think this one through carefully. Imagine that we have a cell that fires around 12 impulses per second in the absence of any
