PART II: Contextual Modulation Over Time: Serial Dependence in the
CHAPTER 5: MODELLING SERIAL DEPENDENCIES IN VISUAL VARIANCE
2. DETAILED DESCRIPTION
2.3. MODEL COMPUTATIONS
2.3.2. Decision layer
Computation of the posterior distribution
The stimulus presented in a given trial n ($S_n$) is not directly accessible to the visual system: only the noisy sensory response that this stimulus has produced in the sensory layer is accessible, i.e. the likelihood probability distribution. As previously stated, this distribution represents how likely the actual neural response is to have been produced by each possible stimulus magnitude of the perceptual space.
Within a Bayesian framework, the optimal decision about the stimulus magnitude must combine the noisy sensory information of the current trial with the knowledge of which stimuli were presented in previous iterations. The latter is conveyed by the prior distribution, which represents the probability of encountering different stimuli given the experimental history. This prior distribution is stored in the decision layer and continuously updated with the information provided by each new trial.
The posterior distribution is the probability function resulting from the combination of the likelihood and prior distributions. It represents the probability that each value of the perceptual space was the presented stimulus magnitude, given the sensory information and the previous history.
At the beginning of the experiment (trial n=1), in the absence of previous information, all stimuli are deemed equally likely a priori; therefore, trial 1 has a flat (uniform) prior defined over the perceptual space. Since the relative probabilities of each perceptual magnitude are unaffected by the flat prior, the posterior of the first trial is equal to the likelihood:
(20)
$\mathrm{Post}_1 \approx N(\mu_{L_1}, \sigma_{L_1})$
All probabilities are normalized so that their sum is always 1, so the combination with a flat prior does not change the value of the function at each value of the perceptual space.
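As a minimal sketch of this point, assuming a discretized perceptual space, the following Python snippet checks that multiplying a normalized likelihood by a flat prior and renormalizing returns the likelihood unchanged. The grid range and the values of `mu_l` and `sigma_l` are hypothetical and chosen only for illustration; they are not parameters of the model.

```python
import numpy as np

# Hypothetical perceptual space and likelihood parameters (illustrative only).
grid = np.linspace(0.0, 10.0, 501)
mu_l, sigma_l = 4.0, 1.5


def normalized_gaussian(x, mu, sigma):
    """Evaluate a Gaussian on a grid and normalize it so that it sums to 1."""
    p = np.exp(-0.5 * ((x - mu) / sigma) ** 2)
    return p / p.sum()


likelihood = normalized_gaussian(grid, mu_l, sigma_l)

# Flat prior of trial 1: every value of the perceptual space is equally likely.
flat_prior = np.full_like(grid, 1.0 / grid.size)

# Bayesian combination followed by renormalization.
posterior = likelihood * flat_prior
posterior /= posterior.sum()

# The posterior of trial 1 matches the likelihood at every grid point.
print(np.allclose(posterior, likelihood))
```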
In subsequent trials, however, the prior can be approximated by a Gaussian probability density function computed over the perceptual space:
(21)
$\mathrm{Prior} \approx N(\mu_{prior}, \sigma_{prior})$
We will later see how the prior is computed.
The posterior distribution is obtained from an optimal Bayesian combination of likelihood and prior. The result of this computation is a Gaussian probability density function whose mean is a weighted average of the means of the likelihood and the prior (Petzschner & Glasauer, 2011):
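(22)
$\mu_{Post} = w_{prior} \cdot \mu_{prior} + (1 - w_{prior}) \cdot \mu_{L}$
where
(23)
$w_{prior} = \frac{\sigma_{L}^2}{\sigma_{prior}^2 + \sigma_{L}^2}$
and
(24)
$w_{L} = 1 - w_{prior} = \frac{\sigma_{prior}^2}{\sigma_{prior}^2 + \sigma_{L}^2}$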
Thus, the mean of the posterior distribution is a weighted average of the means of prior and likelihood, wherein each mean is weighted by the relative variance of the other distribution. This means two things:
1. The bias exerted by previous history at the decision layer is always attractive, since the mean of the posterior will be an intermediate value between $\mu_{L}$ and $\mu_{prior}$. In our model, repulsive effects are generated at an earlier stage of processing, by exposure-dependent gain-decrease in the sensory layer.
2. Reliance on previous history depends on the precision of the current sensory signal (a more precise sensory signal implies lower $\sigma_{L}^2$ and $w_{prior}$) and of the previous history (which might be considered memory precision: the more precise, the lower $\sigma_{prior}^2$ and the larger $w_{prior}$). In other words, reliance on previous history will be stronger when the current sensory signal is highly imprecise, or when the memory representation of previous history is highly precise.
The variance of the posterior is also dependent on the variance of prior and likelihood:
(25)
$\sigma_{Post}^2 = \frac{\sigma_{prior}^2 \cdot \sigma_{L}^2}{\sigma_{prior}^2 + \sigma_{L}^2}$
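The weighted-average mean and the posterior variance described above can be sketched in a few lines of Python. This is only an illustration of the Gaussian combination rule, not the code used for the model; the function name `combine_gaussians` and all numerical values are hypothetical.

```python
import math


def combine_gaussians(mu_l, sigma_l, mu_prior, sigma_prior):
    """Combine a Gaussian likelihood and a Gaussian prior.

    Returns the posterior mean, the posterior standard deviation and the
    weight given to the prior mean.
    """
    var_l = sigma_l ** 2
    var_prior = sigma_prior ** 2
    # Each mean is weighted by the relative variance of the other distribution.
    w_prior = var_l / (var_prior + var_l)
    w_l = 1.0 - w_prior
    mu_post = w_prior * mu_prior + w_l * mu_l
    var_post = var_prior * var_l / (var_prior + var_l)
    return mu_post, math.sqrt(var_post), w_prior


# Equally precise likelihood and prior: the posterior mean lies halfway.
mu_a, sigma_a, w_a = combine_gaussians(mu_l=10.0, sigma_l=2.0,
                                       mu_prior=6.0, sigma_prior=2.0)

# Less precise sensory signal: the prior receives more weight.
mu_b, sigma_b, w_b = combine_gaussians(mu_l=10.0, sigma_l=4.0,
                                       mu_prior=6.0, sigma_prior=2.0)

print(mu_a, w_a)
print(mu_b, w_b)
```

In both calls the posterior mean falls between the prior mean and the likelihood mean, which is the attractive bias noted in point 1, and the posterior is narrower than either input distribution.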
The posterior distribution has two roles in the model: providing the basis for response selection and the basis for the new, most updated prior for the next trial. In order to fulfill these two roles, we assume the posterior is, on the one hand, transferred to downstream areas responsible for response preparation and execution, and on the other hand, stored within the "memory" of the decision layer. However, the posterior does not come out of these processes intact: it is corrupted by two separate sources of Gaussian noise: response noise and memory noise, respectively.
Judgment
The posterior distribution provides the basis for response selection. However, the response (judgment) is not extracted directly from the posterior distribution. As stated in the previous section, the posterior is transferred to downstream areas responsible for response preparation and execution, where it is corrupted by Gaussian response noise with properties $N(0, \sigma_{response})$. This "corruption" is mathematically expressed as the convolution of two Gaussians: the posterior distribution and the Gaussian response noise. The resulting distribution (henceforth called the response distribution) is a Gaussian probability density function with properties:
(26)
$\mathrm{Resp}_n = N\left(\mu_{Post_n}, \sqrt{\sigma_{Post_n}^2 + \sigma_{response}^2}\right)$
The measure of response noise, $\sigma_{response}$, is a free parameter in our model.
The model's judgment about stimulus $S_n$ is obtained from this response distribution: specifically, it corresponds to $\mu_{response} = \mu_{Post_n}$, i.e. to the value where the response distribution peaks. Thus, our model always selects the optimal response, i.e. the value most likely to correspond to the presented stimulus after combining the current sensory information with the previous history. Note that the response noise does not produce any bias, since the peak (mean) of the response distribution is the same as the peak of the posterior. However, it increases the width of the distribution with respect to the posterior and therefore reduces the difference in probability between the most likely and less likely values. As a result, non-optimal responses (i.e. responses that differ from the peak of the distribution, such as those given by real subjects) are more likely under the response distribution than they would be if the response were based directly on the posterior. Thus, response noise does not affect model judgments, but it is relevant for parameter selection through maximum likelihood estimation (see below).
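The effect of response noise can be illustrated with a short Python sketch. It assumes that a subject's response is scored by its Gaussian log-density under the response distribution; this scoring rule, the helper names and all numerical values are assumptions made for illustration and are not taken from the fitting procedure itself.

```python
import math


def response_distribution(mu_post, sigma_post, sigma_response):
    """Convolve the posterior with zero-mean Gaussian response noise.

    The mean is unchanged and the variances add.
    """
    return mu_post, math.sqrt(sigma_post ** 2 + sigma_response ** 2)


def gaussian_log_pdf(x, mu, sigma):
    """Log-density of a Gaussian with mean mu and standard deviation sigma."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - 0.5 * ((x - mu) / sigma) ** 2


# Hypothetical posterior and response noise.
mu_post, sigma_post = 8.0, 1.0
mu_resp, sigma_resp = response_distribution(mu_post, sigma_post, sigma_response=2.0)

# The model judgment is the peak of the response distribution (no bias).
judgment = mu_resp

# Hypothetical subject response that lies far from the peak.
observed = 11.0
ll_with_noise = gaussian_log_pdf(observed, mu_resp, sigma_resp)
ll_without_noise = gaussian_log_pdf(observed, mu_post, sigma_post)

print(judgment, sigma_resp)
print(ll_with_noise, ll_without_noise)
```

With these values the judgment equals the posterior mean, while the off-peak response receives a higher log-likelihood under the wider response distribution than under the posterior alone.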
Prior update
As stated before, the posterior distribution is also stored in the decision layer, where it will become the basis of the prior for the next trial, representing the most updated knowledge of the environment given by the summary of all previous stimuli, with greater weight for the most recent ones. However, between the current trial and the next one, this posterior is corrupted by Gaussian memory noise with properties $N(0, \sigma_{memory})$, so that the prior of the next trial will be a Gaussian distribution resulting from the convolution of the posterior and the memory noise:
(27)
$\mathrm{Prior}_{n+1} = N\left(\mu_{Post_n}, \sqrt{\sigma_{Post_n}^2 + \sigma_{memory}^2}\right)$
$\sigma_{memory}$ is the sixth free parameter in our model. Similar to the case of response noise, the prior is unbiased with respect to the posterior of the previous trial ($\mu_{prior_{n+1}} = \mu_{Post_n}$), but because of memory corruption it is less sharp than the posterior.
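The trial-by-trial cycle described in this section (flat prior on trial 1, Gaussian combination afterwards, and a prior update corrupted by memory noise) can be sketched as follows. This is a simplified illustration under stated assumptions: the likelihood means are supplied directly instead of being produced by the sensory layer, and the function names and parameter values are hypothetical.

```python
import math


def posterior_from(mu_l, sigma_l, prior):
    """Posterior for one trial; prior=None stands for the flat prior of trial 1."""
    if prior is None:
        return mu_l, sigma_l
    mu_prior, sigma_prior = prior
    var_l, var_prior = sigma_l ** 2, sigma_prior ** 2
    w_prior = var_l / (var_prior + var_l)
    mu_post = w_prior * mu_prior + (1.0 - w_prior) * mu_l
    sigma_post = math.sqrt(var_prior * var_l / (var_prior + var_l))
    return mu_post, sigma_post


def next_prior(mu_post, sigma_post, sigma_memory):
    """Prior for the next trial: same mean as the posterior, widened by memory noise."""
    return mu_post, math.sqrt(sigma_post ** 2 + sigma_memory ** 2)


# Hypothetical likelihood means for three trials and illustrative noise levels.
likelihood_means = [10.0, 6.0, 6.0]
sigma_l = 2.0
sigma_memory = 1.0

prior = None
judgments = []
for mu_l in likelihood_means:
    mu_post, sigma_post = posterior_from(mu_l, sigma_l, prior)
    judgments.append(mu_post)  # the judgment is the peak of the posterior
    prior = next_prior(mu_post, sigma_post, sigma_memory)

print(judgments)
```

In this toy run the judgment on trial 2 is pulled from 6 towards the value presented on trial 1, and the pull is weaker on trial 3, consistent with recent stimuli carrying the greatest weight in the prior.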