PART II: Contextual Modulation Over Time: Serial Dependence in the

CHAPTER 5: MODELLING SERIAL DEPENDENCIES IN VISUAL VARIANCE

2. DETAILED DESCRIPTION

2.4. PARAMETER SELECTION

The values for the six free parameters (amp_gain, σ_gain, σ_recovery, σ_tuning, σ_memory, σ_response) are selected based on which combination of values has maximum likelihood given the model and the actual data.

As previously indicated, the likelihood of a judgment J_n produced by a participant in a given trial, given the model and a specific set of parameter values, corresponds to the value that the response probability density function (defined along the perceptual space) takes at J_n. In turn, the likelihood of the whole sequence of judgments provided in the experiment (i.e. the likelihood of the entire dataset) will be the product of the likelihoods of the individual trials:

(28)

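$$
P(\text{data} \mid \text{parameters}) = P(J_1 \cap J_2 \cap \dots \cap J_N) = P(J_1)\, P(J_2 \mid J_1) \cdots P(J_N \mid J_1 \cap J_2 \cap \dots \cap J_{N-1}) = \mathit{Resp}_1(J_1)\, \mathit{Resp}_2(J_2) \cdots \mathit{Resp}_N(J_N)
$$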

where J_n represents the participant’s judgment in trial n, Resp_n is the response PDF produced by the model for trial n, and Resp_n(J_n) is the value of that PDF at J_n. We can write the conditional probabilities as a product of these function values because the response function for each trial, as defined before, is already conditioned on all previous trials (represented by that trial’s prior).
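As an illustration of equation (28), the minimal sketch below evaluates the dataset likelihood both as a product of per-trial PDF values and as a sum of their logarithms, which has the same maximiser but is numerically safer over long trial sequences. The function names and the Gaussian stand-in for the response PDF are assumptions made for this example only; in the model, each per-trial function would be the response PDF generated for that trial.

```python
import math

def gaussian_pdf(x, mean, sd):
    """Gaussian density, used here only as a stand-in for a per-trial response PDF."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def dataset_likelihood(judgments, response_pdfs):
    """Product of Resp_n(J_n) over trials, as in equation (28)."""
    likelihood = 1.0
    for judgment, pdf in zip(judgments, response_pdfs):
        likelihood *= pdf(judgment)
    return likelihood

def dataset_log_likelihood(judgments, response_pdfs):
    """Sum of log Resp_n(J_n); same maximiser as the product, but avoids underflow."""
    return sum(math.log(pdf(judgment)) for judgment, pdf in zip(judgments, response_pdfs))

# Hypothetical three-trial example: each trial has its own response PDF.
judgments = [22.0, 38.5, 51.0]
response_pdfs = [
    lambda x: gaussian_pdf(x, 20.0, 5.0),
    lambda x: gaussian_pdf(x, 40.0, 5.0),
    lambda x: gaussian_pdf(x, 50.0, 5.0),
]
product = dataset_likelihood(judgments, response_pdfs)
log_sum = dataset_log_likelihood(judgments, response_pdfs)
```

With several hundred trials the raw product can underflow to zero in floating point, whereas the log-sum remains well behaved.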

By Bayes’ rule we can obtain the probability of a specific set of parameter values given the data:

(29)

$$
P(\text{parameters} \mid \text{data}) = \frac{P(\text{data} \mid \text{parameters})\, P(\text{parameters})}{P(\text{data})}
$$

Since all possible parameter values are considered equally likely a priori, and the data (the actual responses) are the same for all tested sets of parameter values, we can disregard P(parameters) and P(data) and conclude that P(data | parameters) is maximal for the same parameter values as P(parameters | data).
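Written out, the argument is that under a flat prior the posterior is proportional to the likelihood, so both share the same maximiser. The symbol θ (the vector of six parameters) and the hat notation are introduced here only for compactness and do not appear in the original text; the final step uses equation (28) and the fact that the logarithm is monotonic.

```latex
\hat{\theta}
  = \arg\max_{\theta} P(\theta \mid \text{data})
  = \arg\max_{\theta} \frac{P(\text{data} \mid \theta)\, P(\theta)}{P(\text{data})}
  = \arg\max_{\theta} P(\text{data} \mid \theta)
  = \arg\max_{\theta} \sum_{n=1}^{N} \log \mathit{Resp}_n(J_n)
```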

The parameter values that are tested in each case are:

1. amp_gain: [0, ∞), since the sensory layer is only allowed to produce negative (repulsive) effects.

2. σ_gain, σ_tuning, σ_response: from 0 to the upper end of the perceptual space (90°).

3. σ_memory: from 0 to infinity, since we allow for the possibility of a complete lack of memory, equivalent to a flat prior.

4. σ_recovery: from 0 to 60 (seconds).

For each considered dataset, the parameters with maximum likelihood are determined by using the Matlab function fminsearchbnd.
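The following is a minimal sketch of bounded maximum-likelihood fitting, not the procedure actually used. It makes several simplifying assumptions: only one free parameter (σ_response) is fitted, the response PDF is a Gaussian centred on the presented StD, the data are invented, and a golden-section search replaces the bounded simplex search performed by fminsearchbnd. The full model has six parameters and a trial-dependent response PDF.

```python
import math

def gaussian_log_pdf(x, mean, sd):
    """Log of a Gaussian density; a toy stand-in for log Resp_n(J_n)."""
    return -0.5 * ((x - mean) / sd) ** 2 - math.log(sd) - 0.5 * math.log(2.0 * math.pi)

def negative_log_likelihood(sd_response, stimuli, judgments):
    """Negative log-likelihood of all judgments for one candidate parameter value."""
    return -sum(gaussian_log_pdf(j, s, sd_response) for s, j in zip(stimuli, judgments))

def golden_section_minimize(objective, lower, upper, tol=1e-6):
    """Minimise a unimodal objective on [lower, upper] without leaving the bounds."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lower, upper
    while (b - a) > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if objective(c) < objective(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return (a + b) / 2.0

# Hypothetical data: presented StD values and re-centred judgments, in degrees.
stimuli = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
judgments = [12.0, 17.0, 33.0, 41.0, 46.0, 63.0]

# Search sigma_response within (0, 90]; the lower bound is kept slightly above 0
# because the toy density is undefined at sd = 0.
fitted_sd = golden_section_minimize(
    lambda sd: negative_log_likelihood(sd, stimuli, judgments), 1e-3, 90.0
)
```

For this toy case the maximum-likelihood value has a closed form, the root-mean-square deviation of the judgments from the presented StD, which offers a simple check on the search.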

Before parameter selection, our experimental data were re-centred to remove systematic biases. The rationale for this step was as follows. To translate the position of the response bar along a visual analogue scale into variance reports, our data analysis employed a linear translation, so that the two ends of the response scale corresponded to the lowest and highest RDK standard deviations presented during training (0° and 90°), and the selected position was assigned a numeric value according to its linear distance between both ends (for example, the middle point corresponded to 45°). These values pertained to an unfamiliar dimension, namely the standard deviation of the von Mises distribution for the direction of the individual dots in the RDK. Given the abstract and unfamiliar nature of the judged dimension, the use of a visual analogue scale and the conventional linear translation, it is not surprising that participants’ average judgments, once translated into a number, deviated from the veridical standard deviation, with a marked tendency to ‘overestimate’ large StD values, especially since the maximum StD employed in experimental blocks was 60°, leaving the right-hand third of the response scale free for such apparent overestimation (see Figure 2a). While this conventional linear translation was valid for analysing trial-by-trial biases in participants’ normalized judgments, our model was not designed to account for systematic biases. As a result, using the unmodified reports would have rendered many judgments highly unlikely under most combinations of parameter values, since most combinations would produce judgments centred around the veridical value. For this reason, we removed these apparent systematic biases (introduced by our numerical translation of participants’ judgments) by subtracting, from each judgment, the difference between that participant’s mean response for the corresponding StD and the veridical StD value. Parameter selection was based on these re-centred judgments.
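The re-centring step can be sketched as follows. The data layout (a list of participant, presented StD and judgment tuples), the function name and the example values are assumptions made for illustration; they are not taken from the original analysis code.

```python
from collections import defaultdict

def recentre_judgments(trials):
    """Remove each participant's mean bias at each StD level.

    `trials` is a list of (participant_id, presented_std, judgment) tuples.
    Returns the re-centred judgments in the same order as the input.
    """
    grouped = defaultdict(list)
    for participant, std, judgment in trials:
        grouped[(participant, std)].append(judgment)

    # Systematic bias = mean judgment minus veridical StD, per participant and StD.
    bias = {key: sum(vals) / len(vals) - key[1] for key, vals in grouped.items()}

    return [judgment - bias[(participant, std)] for participant, std, judgment in trials]

# Hypothetical trials: participant "p1" overestimates StD = 60 by 15 on average.
trials = [("p1", 60.0, 72.0), ("p1", 60.0, 78.0), ("p1", 20.0, 19.0), ("p1", 20.0, 23.0)]
recentred = recentre_judgments(trials)
```

After this step the mean judgment for each participant and StD equals the veridical value, while the trial-to-trial deviations around that mean, which carry the serial dependence, are left unchanged.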

3. RESULTS

Figure 17. Model: Results. A pattern of recent positive and less recent negative serial dependencies is observed for the model outputs as well as for participants’ responses. 17a-c present data corresponding to half the participants that took part in Experiment 1 (subset 1, formed by 15 participants, randomly selected out of 30). Model judgments are based on parameter values fitted for the data of subset 1. 17a. Average relative errors made in the current trial (n) as a function of the variance presented in trial n-1 (StD_{n-1}). Relative errors are defined as RE_n = (R_n - StD_n)/StD_n, where R_n is the reported variance judgment. The blue plot represents participants’ judgments, while the red plot represents model judgments. In both cases, error bars indicate the between-participant standard error. An ascending slope is observed for both plots, indicative of an attractive bias related to StD_{n-1}. 17b. Average relative errors, normalized within participant and current stimulus (StD_n), as a function of StD_{n-1}. Again, the ascending slope of both plots (responses, outputs) indicates positive serial dependence, although, due to the smaller variability in model outputs compared to real responses, the magnitude of the bias appears much larger for the former. 17c. Fixed-effects coefficients for Bayesian LMMs for the association between StD_{n-t} (t = 1 … 10) and normalized response or model output in the current trial. Positive and negative serial dependencies are seen at similar timescales for participants’ and model judgments. The inset shows detail for the positions n-2 to n-10. 17d. Analyses analogous to those presented in 17c, but applied to the other 15 participants of the Experiment 1 dataset (subset 2). Model judgments are obtained by using the parameter values fitted for subset 1.

For parameter fitting, we randomly selected a subset of half (N=15) of the participants that took part in Experiment 1 (serial dependence in variance judgments, reported in Chapter 1). In the manner described in section 2.4 of the current chapter, ‘Parameter selection’, we obtained the parameter values that led to the maximum likelihood given the actual data of this subset (henceforth termed subset 1) and the structure of the model. We then ran the model with the fitted parameters on the same trial sequences (the experimental sessions of the 15 participants in subset 1) and obtained the corresponding model judgments. Finally, we analysed both the participants’ and the model judgments for serial dependencies driven by the stimuli presented in past trials, up to n-10. The methodology of these analyses is the same as that followed for the experimental data (Chapter 1).

Figures 17a-c summarize serial dependence analyses on subset 1. The blue plots correspond to participants’ actual judgments (after subtraction of the systematic, average bias; see section 2.4), and the red plots to the model’s predicted judgments. Figure 17a presents the average relative response error (defined as RE_n = (R_n - StD_n)/StD_n) as a function of the StD presented in the previous (n-1) trial, StD_{n-1}. Both plots, corresponding to participants’ and model judgments, exhibit an ascending slope, indicating an attractive effect of the previous stimulus on the current judgment. In Figure 17a participants’ and model judgments are unnormalized; on visual inspection, the size of the attractive bias (represented by the slope of the plot) seems similar for actual responses and model outputs. By contrast, when judgments are normalized within participant and current StD_n, as presented in Figure 17b, the effect size appears to be much larger for the model. This apparent discrepancy is explained by the much lower variability in model judgments compared to participants’ real data, since the model always selects the optimal judgment, i.e. the peak of the posterior probability distribution. Thus, one z-score unit in normalized model outputs corresponds to a smaller change in unnormalized values than one z-score unit in participants’ responses.
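The effect of normalization can be illustrated with a small numerical sketch. The values below are invented, and the grouping is reduced to a single participant and a single current StD; the helper names are assumptions made for this example.

```python
import statistics

def relative_errors(reports, stimuli):
    """RE_n = (R_n - StD_n) / StD_n for each trial."""
    return [(r - s) / s for r, s in zip(reports, stimuli)]

def z_scores(values):
    """Standardise values within one group (e.g. one participant and one current StD)."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

# Hypothetical group: current StD fixed at 40, four trials.
stimuli = [40.0, 40.0, 40.0, 40.0]
participant_reports = [30.0, 38.0, 42.0, 50.0]  # widely scattered
model_reports = [39.0, 39.8, 40.2, 41.0]        # same ordering, ten times less spread

z_participant = z_scores(relative_errors(participant_reports, stimuli))
z_model = z_scores(relative_errors(model_reports, stimuli))
```

Here the two sets of z-scores coincide even though the model's raw deviations are ten times smaller, so a given history-driven shift in raw units translates into a larger shift in z-scores when variability is low.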

As for the experimental data (see Figure 2d for an example), we analysed serial dependence in model outputs by means of Bayesian LMMs with normalized judgment as dependent variable and StD in a past trial position as independent variable. Figure 17c presents the fixed-effects coefficient estimates of 20 (10x2) Bayesian LMMs with either participants’ normalized response (zRE_n; blue plot) or normalized model judgment (red plot) as dependent variable, evaluating the effect of StD_{n-t} (t = 1...10, each position assessed in a separate model), with random effects grouped by participant’s ID. The coefficient estimates represent the increase (in z-scores) observed in normalized responses or model judgments for each 1° of increase in the previous (n-t) StD. A positive bias driven by previous StD is observed in relation to positions n-1 and n-2, while a negative effect arises for more remote positions, mainly n-7 and n-8. The timescale of both types of history-dependent bias is similar for real participants’ and model judgments. Although the positive effect driven by trial n-1 appears much larger for model outputs (B = 0.0489, 95% credible interval [0.0484, 0.0494]) than for participants’ data (B = 0.0029 [0.0008, 0.0049]), this is explained to a great extent by the differences in normalized scores, as shown in Figures 17a and 17b. By contrast, the negative effect of less recent history seems of a similar magnitude for normalized responses and model outputs, but is weaker for the latter if unnormalized outputs are considered. This negative effect is statistically significant at positions n-3, n-7 and n-8, peaking for StD_{n-7}: B = -0.0018 [-0.0032, -0.0003]. The inset graph in Figure 17c presents the same results as the main plot, leaving aside position n-1 to better appreciate the effect size of serial dependence from n-2 to n-10. The progression of the positive and negative serial dependencies (considering normalized reports) is similar for real data and model outputs.
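To make the lag structure of these analyses concrete, the sketch below pairs each current-trial response with the stimulus presented t trials earlier and estimates one slope per lag. It is only a stand-in: it uses an ordinary least-squares slope on a single invented sequence, whereas the analyses reported here fit a separate Bayesian LMM per lag with random effects grouped by participant. All names and values are hypothetical.

```python
def lagged_pairs(stimuli, responses, lag):
    """Pair each response in trial n with the stimulus presented in trial n - lag."""
    return [(stimuli[n - lag], responses[n]) for n in range(lag, len(responses))]

def ols_slope(pairs):
    """Ordinary least-squares slope of y on x (a simple stand-in for the LMM fixed effect)."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Hypothetical sequence in which each response increases by 0.1 units per degree
# of the previous stimulus (trial 0 has no predecessor and is set to 0).
stimuli = [10.0, 50.0, 30.0, 60.0, 20.0, 40.0, 10.0, 60.0]
responses = [0.0] + [0.1 * stimuli[n - 1] for n in range(1, len(stimuli))]

# One slope per lag, each estimated separately, mirroring the one-model-per-position design.
slopes = {lag: ols_slope(lagged_pairs(stimuli, responses, lag)) for lag in range(1, 4)}
```

Because the toy responses depend only on the immediately preceding stimulus, the lag-1 slope recovers the generating value of 0.1; the slopes at longer lags reflect nothing more than chance structure in the short stimulus sequence.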

We then asked whether the same parameter values (fitted to subset 1 data) could produce similar serial dependencies on a different set of trial sequences. Thus, we ran the model on the trial sequences of the remaining 15 participants of the Experiment 1 dataset (subset 2), and analysed serial dependence in the resulting model judgments. This analysis is depicted in Figure 17d, which is analogous to Figure 17c but concerns subset 2 data. A similar pattern of positive and negative serial dependencies is observed in this case.

In conclusion, we built a model on the basis of our conclusions about the origin of the two opposite history-dependent biases in perceptual judgments that were observed in our experimental data: negative biases of likely sensory origin and positive serial dependencies of presumed decisional basis. When these biases are operationalized along the lines of previous models of sensory adaptation and Bayesian decision theory, respectively, the model outputs exhibit them with a magnitude and timescale similar to those found in human participants.

In document Contextual modulation of visual variability: perceptual biases over time and across the visual field (Page 191-199)