5.1 Summary of findings and discussion
5.2.1 Methodological critique of the Two-Step task
With its elaborated design and sophisticated computational model of task performance, the sequential decision-making task designed by Daw et al. (2011) tried to solve the difficulties in operationalizing goal-directed and habitual control in human research (see section 1.2.3.2). However, there is accumulating evidence that the interpretation of findings using the original Two-Step task might be more challenging than previously thought (Akam, Costa, & Dayan, 2015; Kool, Cushman, & Gershman, 2016; Kool, Gershman, & Cushman, 2017; Toyama, Katahira, & Ohira, 2017). Recent theoretical advances underpinned with simulations and empirical data showed that model-based behavioral strategies in the Two- Step task do not lead to increased reward rates for participants, in turn decreasing the motivation to invest the mental effort to behave goal-directedly (Akam et al., 2015; Kool et al., 2016). This might be due to the relative difficulty of distinguishing high from low reward probability options on the second stage of the Two-Step task due to the limited information entailed in the dichotomous outcomes of 2nd-stage choices (rewarded vs. unrewarded) and, thus, the limited insight in what the best stimulus might be at present. After all, one can only sample reward information from one 2nd-stage stimulus at a time and needs several repetitions
to deduce the corresponding reward probability for this stimulus, and all the while the reward probabilities for the other three 2nd-stage stimuli change unobserved. Therefore, a model-
based strategy has decreased importance at 1st-stage choice. This is worsened by the rather
small differences in reward probabilities between options at 2nd stage resulting from the
reflecting boundaries of the Gaussian random walks at 0.25 and 0.75. As a result of these boundaries, two different 2nd-stage stimuli can have a maximum difference in reward
106
probability of 50% when one is at the top and the other at the bottom of the possible probability distribution, but on average the winning probability of two 2nd-stage stimuli differs by only 18% (Kool et al., 2016). Another potential disadvantage of the currently used setup of the Two-Step task is that the rare transitions lead to a decrease in the accuracy-effort association, because model-based control at the 1st stage might not end up in the desired state
on the 2nd stage. Consequently, investing mental effort to exert model-based control was not
associated with increased accuracy of performance and more reward (Kool et al., 2016), thus, frustrating the participant and further decreasing the motivation to invest effort. Taken together, though the Two-Step task might be well designed to measure the balance between model-free and model-based strategies in theory, participants have no incentive to exert goal- directed control over their behavior due to the task setup. This issue confounds the previously obtained results as it is not apparent whether the individually estimated amount of goal- directed control indeed corresponds to the ability of the participant or rather is a result of demotivation (which would even be cost efficient given the lack of benefit from investing cognitive effort). I have performed several additional analyses to test the influence of these shortcomings of the task design in our samples.
5.2.1.1 Additional analyses – Goal-directed control, motivation, reward, and cognitive abilities
First, we asked the participants to rate their motivation to choose the currently best stimulus in the Two-Step task and how this motivation changed over the course of the experiment right after they performed the task paradigm (Figure 13, left panel). These ratings revealed that participants on average reported to be rather motivated during task performance (Mdn = 5, mode = 6 on a scale ranging from 1 = little motivated to 7 = very motivated), but motivation decreased on average over the course of the task (Mdn/mode = -1 on a scale ranging from -3 = decreased to 3 = increased). Furthermore, ratings of motivation correlated positively with parameters of model-based control (MBscore; Spearman's ρ = .153, p = .037)
but the change of motivation did not (see Figure 13, right panel). As you can see in the upper right panel of Figure 13, every participant in the highest 15% of model-based behavior (MBscore > 0.59, vertical line) was motivated to perform well (motivation rating ≥ 5).
107
Figure 13. Upper panel: Histogram of Sample 1 participants’ answers to rate their motivation to choose the currently best stimulus in the Two-Step task on a seven-point Likert scale (left) and the correlation of this rating with the score for model- based control strategies derived from the individual stay probabilities (MBscore;Spearman's ρ=.153, p=.037). The horizontal
line in correlation plot represents the middle category of the Likert scale, the vertical line represents 85th percentile (MB score
= 0.59). Lower panel: The same as in the upper panel but with participants’ ratings of how their motivation has changed during task performance (Spearman's ρ=.053, p=.474; N=186 for both panels due to two missing data points in the ratings).
Although this is speculative and exploratory, this correlational pattern suggests that there might be three subgroups in this sample regarding their Two-Step behavior: one group being motivated and exerting goal-directed control although it is not profitable (upper right quadrant in correlation plot, corresponding to 15% of the sample), one group lacking motivation and ranging from low to medium goal-directed control (lower left quadrant in correlation plot, corresponding to 9% of the sample), and one group being motivated to perform well but nevertheless yielding a rather wide range of the amount of goal-directed control (upper left quadrant in correlation plot). Thus, the motivation to perform well in the Two-Step task in this sample seems to be a necessary but not sufficient condition for high amounts of goal-directed behavior, but reported motivation cannot explain low to medium amounts of goal-directed control. In general, this shows that motivation was not decreased in
108
most of our participants and the amount of goal-directed control varied widely. Additionally, motivation to perform well in the task was similarly distributed in AUD patients (Mdn = 6, mode = 6) and control participants of Sample 2 (Mdn = 5, mode = 6), and did not differ between these two groups (z = -1.35, p = .179 4). These additional findings do not invalidate the critique on the Two-Step task, but indicate that our results might not have resulted from pure demotivation of the participants.
Moreover, Kool, Cushman, and Gershman (2016) found no association between the degree of goal-directed control and the average reward rate in the Two-Step task. This finding is evident in our young-adult sample, too (MBscore x average reward rate; Spearman's ρ = .016,
p = .832). Nevertheless, participants’ 2nd-stage choice behavior was not random. The four 2nd-
stage stimuli can be brought into an order from the lowest to the highest reward probability associated with them at each trial for each participant and ranked accordingly (from 1 = lowest to 4 = highest probability). With the help of this coding scheme, I could classify each choice depending on its relative optimality and calculate a rank sum score of this choice parameter. By dividing this rank sum by the amount of trials (201 minus missing trials), I obtained the average rank of the chosen 2nd-stage stimuli. Examining this rank score yielded
two important insights: first, the average reward score over the group differed significantly from 2.5, which would have signified random choice behavior on 2nd stage (t(187) = 17.30, p
< .001; see Figure 14, left panel). Second, the amount of model-based control during the Two- Step task correlated positively with the average rank of chosen stimuli (Spearman's ρ = .171,
p = .019). These analyses replicate the finding of Kool, Cushman, and Gershman (2016) that
goal-directed behavior in the original Two-Step task does not benefit the participant in terms of an overall increased reward rate. However, they also suggest that participants in our samples tried to perform well despite the lack of benefit from exerting mental effort, again suggesting that task performance was not just a result of demotivation or random choice behavior.
A third approach to put the validity of our Two-Step findings to the test is by looking at the association with cognitive abilities. Previous studies have found that higher working memory capacity and cognitive processing speed was related to more model-based goal- directed control in the Two-Step task (Schad et al., 2014; Smittenaar et al., 2013), though this relation might be mediated by acute (Otto, Raio, Chiang, Phelps, & Daw, 2013)and chronic stress (Friedel et al., 2017).
109
Figure 14. Display of choice behavior of young-adult sample on 2nd stage of the Two-Step task. Left panel: Histogram of the
average rank score of chosen 2nd-stage stimuli. Right panel: Scatterplot of average rank score of chosen 2nd-stage stimuli and
model-based control parameter calculated from stay probabilities (MBscore). Horizontal line in both plots at 2.5 represents
theoretical random choice behavior.
Indeed, additional correlational analyses of the association between model-based control and cognitive abilities in Sample 1 revealed significant correlations of the score of model-based behavior (MBscore) with performance in the Trail Making Test B (TMT-B;
Spearman's ρ = -.217, p = .003; Reitan, 1992), digit span backwards (DSbw; Spearman's ρ = .198, p = .006; Wechsler, 1997) and digit symbol substitution test of the Wechsler Adult Intelligence Scale (DSST; Spearman's ρ = .204, p = .005; Wechsler, 1997), and Mehrfachwahl-Wortschatztest Version B (MWT-B; Spearman's ρ = .183, p = .012; Lehrl, 2005). These correlations signify an association between more model-based behavior in the Two-Step task to be associated with faster cognitive processing speed (TMT-B, DSST), higher working memory capacity (DSbw), and a higher proxy for verbal intelligence (MWT- B). Taking together the results of these additional analyses indicated that the amount of goal- directed and habitual control utilized in Study 1 and 2 represented the actual abilities of the participants at least in part, which suggests our results might not be invalidated by the methodological critique of the Two-Step task. But at the same time, the associations of Two- Step task performance with individual cognitive abilities show the necessity to investigate this link further in health as well as their possible alteration in disease (see section 5.3.1).
110