EXPLORE VERSUS EXPLOIT AND EXECUTIVE FUNCTIONS

CHAPTER 5: BIWAYS AND PITFALL

2. CAUSES OF SUBOPTIMAL BEHAVIOUR

2.2 EXPLORE VERSUS EXPLOIT AND EXECUTIVE FUNCTIONS

I believe the Serialbox provides some insights into how chimpanzees optimise behaviour when both Solutions A and B are complex, encompassing support for both Hypotheses 1 and 2, and highlights the multi-faceted nature of behavioural change when examining behaviours which better approximate cumulative culture. In Serialbox Experiment 1, when chimpanzees could still successfully forage with their established method, only a small minority relinquished their old solution and flexibly upgraded to a more efficient alternative. Although two out of five naïve controls were able to perform Solution B, it took them around two hours to do so. Three other individuals failed to solve either Serialbox Solution A or B despite interacting with the box. This is in contrast to Biways and Pitfall, where Solution B was converged on much earlier, and by all participating chimpanzees. This makes it more difficult to conclude definitively that having a prior solution in and of itself hindered behavioural optimisation within the Serialbox task. Instead, alternative explanations (Hypothesis 2) may also explain suboptimal behaviour. When individuals had a working solution (Experiment 1), the majority did not use social information or exploratory behaviour to upgrade to a more efficient behaviour. We might consider that under certain circumstances, and as outlined above, when a known solution is still functional, chimpanzees may not be willing to invest in learning a new one. Specifically, within the Serialbox task, high levels of response prepotency and the complexity of solutions may have militated against behavioural change: whilst the alternative method is behaviourally more efficient, it may be cognitively quite costly to relinquish an old solution through inhibition and to learn a new solution through what may be a lengthy trial-and-error process (Holmes & Cohen, 2014).
Additionally, the benefit of this may have been minimal, with gains in response efficiency insufficient to motivate investment. As we found no effect of social information on behaviour in Experiment 1, we should also consider that having a working solution may decrease attention to relevant external cues; this is somewhat reminiscent of a ‘copy-when-dissatisfied’ social learning strategy, whereby an individual is most attuned to social information when their personal strategy is unsatisfying (Laland, 2004; Marshall-Pescini & Whiten, 2008; Yamamoto et al., 2013; see also Braet et al., 2009; Hester et al., 2009, for related arguments relating to attention and behavioural inhibition in humans). However, given that we found an effect of social information in both Biways and Pitfall, when chimpanzees also had working solutions, we should be cautious about that interpretation.

Chapter 7 120

In Serialbox Experiment 2, by partially blocking the initial solution, we altered the nature of both Solutions A and B. The decreased reliability of established behaviour A perhaps reduced response prepotency. Although participants perseverated quite strongly with this now extremely inefficient response, which had only a 25% success rate, it is highly likely that response prepotency decreased over time due to repeated failure. The repositioning of the token meant that participants no longer had to build on Solution A to achieve Solution B. This may have simplified Solution B somewhat, although the action of pulling open the door still remained novel to participants. Again, in line with human studies, given the extensive practice chimpanzees had with their original inefficient solution, and the cognitive load of behaviours, it is perhaps expected that we see high levels of behavioural conservatism with emerging behavioural flexibility: with time, new solutions may be converged upon due to a combination of exposure to alternative solutions and the weakening of response prepotency (Wiley, 1998); that is, behaviour resulting from the combination of personal experience with social information (Derex et al., 2015; Mesoudi, 2011a; Rieucau & Giraldeau, 2011; Whalen, Cownden, & Laland, 2015). This style of learning has been suggested to underlie not only the acquisition of complex behaviours in wild chimpanzees but also technologies in our hominin line (Whiten, 2015). It is not possible to quantify exactly how social information was incorporated into the behaviour of the chimpanzees in Experiment 2, as this would likely be confounded with social information in Experiment 1; that is, we cannot say for certain that chimpanzees did not acquire knowledge about Solution B in Experiment 1. Interestingly, however, studies with humans indicate a role for increased attentional control after failures on tasks (Braet et al., 2009; Hester, Madeley, Murphy, & Mattingley, 2009).

This highlights an important consideration of how social information is used. The open diffusion methodology in Chapters 5 and 6 is such that continued use of Solution A by participants in our experimental groups (IPSI in Biways and Pitfall, social information groups in Serialbox) will positively correlate with the number of observations of Solution B by the model: both increase as a function of time. To avoid confusion within analyses, I coded social information in a binary fashion: an individual either had it or did not. However, to paraphrase a reviewer who critiqued the Serialbox study during the publication process, it could be that chimpanzees reached what may be considered a ‘threshold of information’ which then resulted in behavioural change. This is a possibility, and additionally, it may also be that with time, we should expect some spurious exploration. However, I would argue that the variability we see in chimpanzee perseveration across tasks, not only in my own work but in that reviewed in Chapter 2, may complicate this picture; for example, Marshall-Pescini and Whiten (2008) found most chimpanzees did not adopt Solution B despite witnessing over 180 demonstrations. To examine whether results could be explained by some information threshold, I compared the number of solutions taken to converge on Solution B between Biways and Pitfall Study 2.1, where no or relatively minimal perseveration was seen, with solutions taken in Pitfall Study 2.2, where evidence of perseveration was found (note that this topic has already been specifically addressed for the Serialbox in Chapter 6). If behavioural change is underlain by reaching some threshold of information, we should expect to see adoption of Solution B at similar rates across these studies, and that perseveration is a result of not having reached that critical threshold. Analyses show this is not the case, with individuals within Biways and Pitfall Study 2.1 (no effect of prior solution) adopting Solution B after an average of 26 (range 3-54) and 7 (range 1-18) witnessed solutions respectively. In contrast, those in Pitfall Study 2.2 (effect of prior solution) witnessed Solution B 10 (range 1-20) times before adopting it. Naïve individuals converged on Solution B in both the Biways and Pitfall 2.2 studies after 4 (range 0-14) and 5 (range 0-10) social observations respectively. This indicates that social information as measured on a continuous scale does not easily explain the pattern of perseveration seen, and ultimately, very little information was needed about Solution B to adopt it. In general though, the methodology I have used does not lend itself to this form of analysis: a dominant female always modelled Solution B, with lower ranking chimpanzees having to wait until she moved away from the apparatus before they could participate. Typically this resulted in a high number of observations of the model, with relatively fewer opportunities to personally interact with the task.
Overall, however, the data are most consistent with perseveration being linked to the complexity of solutions.
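
The comparison of witnessed solutions across studies can be laid out as a short sketch. The averages and ranges are those reported in the text for experienced individuals; the `Study` record, its field names and the `group_means` helper are illustrative assumptions introduced here, not part of the original analysis, and the final check is only a reading aid rather than a statistical test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    # Hypothetical record type; field names are assumptions for illustration.
    name: str
    perseveration: bool   # was an effect of prior solution reported?
    mean_witnessed: int   # average witnessed Solution B events before adoption
    low: int              # lower bound of the reported range
    high: int             # upper bound of the reported range


# Values as reported in the text for individuals with a prior solution.
STUDIES = [
    Study("Biways", False, 26, 3, 54),
    Study("Pitfall Study 2.1", False, 7, 1, 18),
    Study("Pitfall Study 2.2", True, 10, 1, 20),
]


def group_means(studies, perseveration):
    """Return the reported averages for studies with or without perseveration."""
    return [s.mean_witnessed for s in studies if s.perseveration == perseveration]


no_effect = group_means(STUDIES, False)  # [26, 7]
effect = group_means(STUDIES, True)      # [10]

# A threshold account predicts a similar number of witnessed solutions before
# adoption in every study. Instead, the perseverating study falls between the
# two non-perseverating ones, which themselves differ widely.
within_range = min(no_effect) <= effect[0] <= max(no_effect)
print(within_range)
```

Laid out this way, the average for the study showing perseveration sits inside the spread of the two studies that did not, which is the pattern the text takes as inconsistent with a single information threshold.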

I believe the picture changes when we consider behaviour where solution complexity precludes either complete innovation by a single individual (as in true cumulative culture), or behaviours which are only within the capabilities of a rare innovator (Whiten et al., 2009). Here we might expect an interaction effect between having a prior solution and exposure to social information considered on a continuous scale (as opposed to the binary analyses I employ). As highlighted in Chapter 1, when attempting to learn a complex behaviour, there are likely to be repeated learning attempts, which should incorporate both social and trial-and-error learning (e.g. Whiten, 2015). Therefore, when looking at behavioural flexibility which involves a complex Solution A and a complex Solution B, it is hard to disentangle the effects of social information from the effects of prior solution on adoption of Solution B, as here social information may be needed for acquisition (as opposed to only facilitating it as in both Biways and Pitfall). Further, other considerations that I have highlighted above need to be borne in mind, such as how likely an agent is to invest resources in learning a complex behaviour, especially if they already have a working solution, and how that will be affected by the way in which information is extracted from observations (i.e. whether process information is necessary). This suggests we need to consider explanations that incorporate both Hypotheses 1 and 2.

Returning to the Serialbox, Experiment 3 mirrors Pitfall Study 2.1, where chimpanzees were tasked with combining known behaviours (or those well within their innovative capabilities) to optimise outcome. In both these studies, chimpanzees readily built on behaviours, indicating that chimpanzees have little problem with accumulation when composite elements are known to them (cf. Manrique et al., 2013). Interestingly though, note that in both these conditions, chimpanzees were not necessarily having to inhibit a prepotent response. This is in contrast to both Serialbox Experiment 1 and Pitfall Study 2.2, where inhibition of a well-practiced complex action sequence is required, and where perseveration was evident. Further, both Serialbox Experiment 1 and Pitfall Study 2.2 not only involve a higher level of complexity than Biways (where inhibition was also required) but they also differ on another feature: the optimum behaviour (Solution B) involves only a partial inhibition of Solution A, and incorporates components of A (lift lid in Serialbox and slide box in Pitfall). This is reminiscent of some technological accumulation, whereby, for example, the construction process is interrupted at some mid-point and modified, as opposed to building onto the end or fully relinquishing the variant. It may be that using elements of Solution A primes the full expression of A, making inhibition at some intermediate point more difficult than if no elements of Solution A had been employed (Houghton & Tipper, 1994). For example, Cragg and Nation (2008) found on closer inspection of a Go/No-go task that a substantial number of inhibitions involved first initiating the response and then successfully terminating it before completion, with older children (9-11 years versus 5-7 years) being better at this; that is, successful inhibition may not necessarily be characterised by totally relinquishing a response, but rather by an ability to terminate a response part way through, and this may be dependent on cognitive resources (as evidenced by the effect of age). Interestingly, this draws parallels with research in habit formation, where evidence suggests that in chunked behaviours, execution of the initial element of the chunked sequence is a powerful predictor of whether or not the full habitual response is expressed (Smith & Graybiel, 2016). These parallels raise an interesting question of how we define responses in these tasks: are they goal-oriented or are they under habitual control? I address this in the next section.

In summary, behavioural inflexibility is not caused by just one factor. It is very likely that not only are there multiple variables affecting behavioural optimisation, but that they are highly interconnected; for example, social learning strategies (when to engage in social learning and from whom to learn; Laland, 2004) are an extension of learning heuristics and explore versus exploit decision algorithms. These in turn are likely linked to the types of behaviours involved, with behavioural complexity affecting decisions, as well as the use of social information. Both of these are dependent on cognitive resources, such as selective attention to environmental and internal cues signalling the potential costs and benefits of behavioural change (Braet et al., 2009; Hester, Madeley, Murphy, & Mattingley, 2009; Padmala & Pessoa, 2010; Rushworth et al., 2012; Theeuwes, 2010), holding in memory representations of both past and potential behavioural variants, identifying the relevant action sequences, and successfully inhibiting or adding action elements to these sequences.

In document The context of behavioural flexibility in chimpanzees (Pan troglodytes) : implications for the evolution of cumulative culture (Page 127-131)