
PART II: Contextual Modulation Over Time: Serial Dependence in the

CHAPTER 5: MODELLING SERIAL DEPENDENCIES IN VISUAL VARIANCE

1. MODEL SUMMARY

Figure 16 presents a graphic summary of the model structure.

1 Henceforth we will refer to the subject’s or model’s responses as judgments (of the presented stimulus magnitude in each trial), to distinguish them from the population responses of individual neural populations in the sensory layer, and from the sensory response (which is the pooled response of all the neural populations of the sensory layer, i.e. the likelihood distribution in one trial). However, we retain the term ‘response noise’ for the noise corrupting the posterior distribution (and thus forming the response distribution) before judgment selection, as it is more standard than ‘judgment noise’. We also employ other established terms such as response preparation, response execution, etc., in relation to the downstream processes responsible for managing the posterior distribution and producing a judgment.

Figure 16. Model: Basic structure. The model performs perceptual decision-making across two layers: a sensory layer, which processes the information from the current stimulus using neural population codes and is subject to exposure-dependent gain reduction, and a decision layer, which makes an optimal judgment through a Bayesian combination of the sensory response from the lower layer and the information from the previous stimulus history, which is updated on every trial. Thus, negative and positive biases with respect to previous history are generated at different levels of perceptual processing and have different properties and timescales. The position of the free parameters within the model structure is shown in the figure (see section 2.2 for further information).

1.1. SENSORY LAYER: POPULATION CODES

Negative after-effects have been successfully modelled with population codes (Heron et al., 2012; Jazayeri & Movshon, 2006; Roach et al., 2011). Population codes represent the neural code (which transduces stimulus magnitudes into sensory responses) as a series of neural populations, each sensitive to some preferred stimulus magnitudes. This preference can be modelled by associating each neural population with a tuning function: a probability function expressing how likely each neuron is to produce a response as a function of the stimulus magnitude it is exposed to. Thus, the response of a neuron belonging to a given population (which may be expressed in terms of firing rate) will depend on the received stimulus and the corresponding value of that population’s tuning function, but also on two other factors. The first is the internal noise of the system, which explains its probabilistic nature. The second, and key to the working of our model, is the neural gain, i.e. the scaling factor relating the stimulus to the intensity of the neural response.

The actual sensory response results from the combination of the responses of all neural populations. Because the relationship between stimulus magnitude and neural activity is not unambiguous, this sensory response, like its components, is also probabilistic: it may be characterized by a probability function expressing the likelihood that each stimulus magnitude generated the actual response. In our model, this is the output of the sensory layer and is called the likelihood distribution, a Bayesian term justified by its role in the Bayesian-like computations that will take place in the decision layer.
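The encoding and pooling steps described above can be illustrated with a minimal sketch. It is not the implementation used in this chapter: the Gaussian tuning shape, the Poisson spiking noise, and every name and parameter value (`log_rates`, `likelihood`, `peak_rate`, the tuning width of 10, the spacing of preferred magnitudes) are assumptions chosen for illustration.

```python
import numpy as np

def log_rates(stimuli, preferred, width, gain, peak_rate=20.0):
    """Log mean firing rate of each population (rows) at each stimulus (columns).

    Assumes Gaussian tuning functions scaled by a per-population gain.
    """
    diff = np.atleast_1d(stimuli)[None, :] - preferred[:, None]
    return np.log(gain * peak_rate)[:, None] - 0.5 * (diff / width) ** 2

def likelihood(responses, grid, preferred, width, gain):
    """Normalised likelihood of each grid magnitude given the spike counts.

    Uses the Poisson log-likelihood sum_i [r_i * log f_i(s) - f_i(s)],
    dropping the term that does not depend on s.
    """
    lr = log_rates(grid, preferred, width, gain)
    log_like = responses @ lr - np.exp(lr).sum(axis=0)
    like = np.exp(log_like - log_like.max())  # subtract max for stability
    return like / like.sum()

rng = np.random.default_rng(0)
preferred = np.linspace(-50.0, 150.0, 81)  # preferred magnitudes (hypothetical)
gain = np.ones_like(preferred)             # unadapted populations
grid = np.linspace(0.0, 100.0, 1001)       # perceptual space (hypothetical)
width = 10.0
stimulus = 60.0

# Noisy population response to one stimulus presentation.
mean_rates = np.exp(log_rates(stimulus, preferred, width, gain))[:, 0]
responses = rng.poisson(mean_rates)

# Pooled sensory response: the likelihood distribution over magnitudes.
like = likelihood(responses, grid, preferred, width, gain)
estimate = grid[np.argmax(like)]
print(f"stimulus = {stimulus}, likelihood peak = {estimate:.1f}")
```

With unit gain in every population, the likelihood peaks close to the presented magnitude, and its width reflects the internal noise.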

By making neural gain subject to exposure-dependent changes, we can use population codes to model history-dependent modulation of current perception. Moreover, both the selectivity of neural populations to certain stimulus magnitudes and the gain changes are biologically plausible and have been demonstrated for several perceptual dimensions (Carandini & Heeger, 2013; Dragoi, Sharma, & Sur, 2000; He, Cohen, & Hu, 1998; Kohn, 2007). Specifically, population-code models with exposure-dependent ‘fatigue’ (gain reduction) have been successfully employed to reproduce the adaptation processes responsible for negative after-effects (Heron et al., 2012; Jazayeri & Movshon, 2006; Roach et al., 2011).
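The following sketch illustrates how exposure-dependent gain reduction can produce a repulsive (negative) bias. It rests on several assumptions that are not taken from this chapter: a Gaussian-shaped gain dip centred on the adaptor (`adapted_gain`, with hypothetical `reduction` and `spread` values), noise-free responses, and a decoder that is unaware of the adaptation and therefore reads out the response with unit gain.

```python
import numpy as np

def log_rates(stimuli, preferred, width, gain, peak_rate=20.0):
    """Log mean firing rate per population (rows) and stimulus (columns)."""
    diff = np.atleast_1d(stimuli)[None, :] - preferred[:, None]
    return np.log(gain * peak_rate)[:, None] - 0.5 * (diff / width) ** 2

def adapted_gain(preferred, adaptor, reduction=0.5, spread=10.0):
    """Hypothetical fatigue profile: gain dips most for populations tuned to the adaptor."""
    return 1.0 - reduction * np.exp(-0.5 * ((preferred - adaptor) / spread) ** 2)

def decode(test, adaptor, preferred, grid, width=10.0):
    """Peak of the likelihood for a test stimulus presented after an adaptor."""
    gain = adapted_gain(preferred, adaptor)
    # Noise-free mean response of the adapted populations to the test stimulus.
    response = np.exp(log_rates(test, preferred, width, gain))[:, 0]
    # Assumption: the read-out ignores adaptation and uses unit gain.
    lr = log_rates(grid, preferred, width, np.ones_like(preferred))
    log_like = response @ lr - np.exp(lr).sum(axis=0)
    return grid[np.argmax(log_like)]

preferred = np.linspace(-50.0, 150.0, 201)
grid = np.linspace(0.0, 100.0, 1001)

above = decode(60.0, 50.0, preferred, grid)  # test above the adaptor
below = decode(40.0, 50.0, preferred, grid)  # test below the adaptor
same = decode(50.0, 50.0, preferred, grid)   # test equal to the adaptor
print(above, below, same)
```

Under these assumptions the decoded magnitude is pushed away from the adaptor on both sides and is unbiased when test and adaptor coincide, which is the signature of a negative after-effect.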

In order to model a reversal from positive to negative after-effects, a parsimonious approach might posit an exposure-dependent modulation of neural gain that first causes a brief enhancement and then a more prolonged reduction. In this regard, Whitney and colleagues have modelled their serial dependence data on the basis of changes in the gain or tuning of neural population codes (Fischer & Whitney, 2014). However, as stated above, our research suggests that the two effects have different properties and purported origins: the positive bias likely arises from decisional rather than sensory processes, and the negative bias seems to appear almost as early as the positive effect but to last much longer, only becoming evident when the competing effect declines. For these reasons, in our model the sensory layer generates only negative after-effects, while the positive serial dependence arises from the decision layer.

1.2. DECISION LAYER: BAYESIAN-LIKE KALMAN FILTER

In our model, positive serial dependence arises from a Bayesian-like combination of the current noisy sensory response (the likelihood distribution, output of the sensory layer) and a prior probability of encountering certain stimulus magnitudes, given by trial history.

Thus, the prior distribution acts as a sort of summary memory representation, responsible for the attractive bias toward previous sensory history. As new information is received in each model iteration (i.e. each trial), this memory needs to be updated. Moreover, it seems reasonable that recent sensory input should be given more weight in constructing the prior for a given trial, due to memory limitations and the need for a balance between perceptual stability and adaptation to environmental changes. This is consistent with the attractive bias exerted by recent trials in our data.

Kalman filters provide an algorithm for the recursive, Markovian update of the prior probability that achieves an optimal integration of information when both the prior and the likelihood are Gaussian functions. They have been used effectively for decoding neural activity in the motor cortex (W. Wu, Gao, Bienenstock, Donoghue, & Black, 2006) and for explaining other biases in perception, such as those in iterative estimations of displacement (Petzschner & Glasauer, 2011) and temporal regularities (Luca & Rhodes, 2016). In the first iteration of the model, the prior is a uniform probability distribution, as the absence of previous sensory inputs, and therefore of previous knowledge about the statistics of the environment, renders all magnitudes equally likely a priori. In subsequent iterations, the posterior distribution generated in the previous trial (by combination of that trial’s prior and likelihood distributions) becomes the prior for the current trial after corruption by memory noise.
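The recursion described above can be sketched as a scalar Kalman update. This is an illustration under stated assumptions rather than the chapter's implementation: the uniform first-trial prior is approximated by a Gaussian with infinite variance, memory noise is assumed to add a fixed variance (`memory_var`) to the posterior, and the function names and numerical values are hypothetical.

```python
import math

def kalman_step(prior_mean, prior_var, like_mean, like_var):
    """Combine a Gaussian prior and a Gaussian likelihood into a Gaussian posterior."""
    if math.isinf(prior_var):
        # Flat prior (first trial): the posterior equals the likelihood.
        return like_mean, like_var
    gain = prior_var / (prior_var + like_var)  # Kalman gain
    post_mean = prior_mean + gain * (like_mean - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var

def run_trials(like_means, like_var, memory_var):
    """Return the posterior mean on each trial of a sequence."""
    prior_mean, prior_var = 0.0, math.inf  # uniform prior on the first trial
    judgments = []
    for like_mean in like_means:
        post_mean, post_var = kalman_step(prior_mean, prior_var, like_mean, like_var)
        judgments.append(post_mean)
        # The posterior becomes the next prior after corruption by memory noise
        # (assumed here to leave the mean unchanged and inflate the variance).
        prior_mean, prior_var = post_mean, post_var + memory_var
    return judgments

judgments = run_trials([50.0, 60.0, 60.0], like_var=25.0, memory_var=25.0)
print(judgments)
```

In this example the second judgment falls between 50 and 60, that is, it is attracted toward the previous trial, and a larger `memory_var` weakens that attraction by flattening the prior.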

The likelihood distribution produced in the sensory layer is forwarded to the decision layer and combined with the prior in order to form the posterior distribution. The simplest case would be to assume an optimal, purely Bayesian combination, i.e. the product of both Gaussians:

$$ P(S \mid R) = \frac{P(R \mid S) \, P(S)}{P(R)} \tag{1} $$

where R is the sensory response generated by the stimulus S. We assume that all neural responses are equally likely a priori and therefore discard P(R).
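For the Gaussian case, the product of prior and likelihood has a standard closed form, written out here as a worked step. The symbols $\mu_L, \sigma_L^2$ (likelihood mean and variance) and $\mu_P, \sigma_P^2$ (prior mean and variance) are notation introduced for this illustration only and may differ from the notation used elsewhere in the thesis.

```latex
\begin{align}
P(S \mid R) &\propto \mathcal{N}(S;\, \mu_L, \sigma_L^2)\; \mathcal{N}(S;\, \mu_P, \sigma_P^2)
             \propto \mathcal{N}(S;\, \mu_{post}, \sigma_{post}^2), \\
\mu_{post} &= \frac{\sigma_P^2\, \mu_L + \sigma_L^2\, \mu_P}{\sigma_P^2 + \sigma_L^2}, \qquad
\sigma_{post}^2 = \frac{\sigma_P^2\, \sigma_L^2}{\sigma_P^2 + \sigma_L^2}.
\end{align}
```

The posterior mean is a reliability-weighted average of the likelihood and prior means, so it is drawn toward the prior by an amount that grows as the prior becomes narrower relative to the likelihood. The posterior variance is smaller than either component variance.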

The resulting posterior distribution, evaluated at each stimulus magnitude within the perceptual space, represents the probability that each stimulus magnitude initiated the current iteration of the model, according to a Bayesian framework that takes into account previous knowledge of the environment. We assume that the model behaves optimally, so that the selected judgment is the peak (i.e. the mean) of the posterior of the trial. However, the judgment is not selected directly from the posterior, but from the response distribution, which results from the posterior being transferred to areas responsible for response execution and corrupted by response noise. The response distribution, evaluated at each magnitude of the perceptual space, represents the likelihood of reporting that magnitude as the judgment in response to the received stimulus. Response noise does not cause any bias in the response distribution, which has the same mean/peak as the posterior distribution but a larger variance (see below for detail). As stated before, this posterior becomes the prior for the next trial after corruption by memory noise.
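The claim that response noise widens the response distribution without shifting it can be checked numerically. The sketch below assumes that response noise is additive, zero-mean and Gaussian, so that corruption amounts to convolving the posterior with a Gaussian kernel. This assumption, the grid, and all numerical values are illustrative and are not taken from the model specification.

```python
import numpy as np

def gaussian(x, mean, var):
    """Gaussian density evaluated at x."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def moments(grid, density):
    """Mean and variance of a density sampled on a regular grid."""
    weights = density / density.sum()
    mean = np.sum(grid * weights)
    var = np.sum((grid - mean) ** 2 * weights)
    return mean, var

grid = np.linspace(0.0, 100.0, 2001)  # perceptual space (hypothetical)
step = grid[1] - grid[0]

posterior = gaussian(grid, 56.0, 16.0)  # hypothetical posterior of one trial
judgment = grid[np.argmax(posterior)]   # optimal judgment: posterior peak

# Zero-mean Gaussian response noise (assumed), sampled on a symmetric,
# odd-length kernel so that the convolution does not shift the distribution.
response_noise_var = 9.0
offsets = np.arange(-400, 401) * step
kernel = gaussian(offsets, 0.0, response_noise_var)
kernel /= kernel.sum()

response = np.convolve(posterior, kernel, mode="same")

post_mean, post_var = moments(grid, posterior)
resp_mean, resp_var = moments(grid, response)
print(post_mean, post_var, resp_mean, resp_var)
```

Under these assumptions the two distributions share the same mean, and the variance of the response distribution is the sum of the posterior variance and the response-noise variance.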