Theta burst stimulation and ballistic motor learning
5.5 Modelling the ballistic motor learning task
A simple theoretical model was designed to show how a performance improvement may be achieved with minimal assumptions about how learning would occur. The aim of creating such a model was to allow us to investigate the effects of altering
performance variability on the outcome, and thereby to determine whether it is feasible that such variability may have a positive effect on learning. This model was based on 2 assumptions:
1) that there is a maximum physically achievable peak acceleration; and
2) that the motor cortex has a fixed repertoire of possible outputs, each coding for different muscle groups, which can be discharged in parallel.
Maximising the motor output would therefore involve determining the optimum weighting in which these motor outputs are to be discharged, presumably favouring task agonists over task antagonists. Thus the system must gradually solve a multi-dimensional problem using feedback given in one dimension, in the form of visual feedback from the previous trial. For the sake of our model we reduced the motor output repertoire to 2 dimensions, represented by 2 orthogonal axes x and y. The contribution of each axis to the observed motor output was defined by the same
exponential function, such that for a given combination (x,y) the observed output is
This function was chosen for the simple reason that it generates a motor output function with a single peak value, achieved at optimum values of x and y (see Figure 5.3a). In our model the following strategy was applied in an iterative algorithm:
1) Test the motor output at a random test point centred around the current search-centre coordinates (x,y). The distance of this test point from the current search-search-centre coordinates obeys a normal probability density function, with a variance that remains fixed (the Output variance, OutputVar).
2) If the resulting output is an improvement on the best output so far, then the search-centre coordinates are updated – see step 3. Otherwise these coordinates remain unchanged and the model returns to step 1 (next iteration).
3) The search-centre coordinates are moved by a fixed proportional distance along a line joining the current search-centre with the system’s perception of the most recent test coordinates. This perception of the test coordinates is not identical to the actual coordinates just used, in order to reflect a degree of error in both recalling the motor output just generated and in interpreting the afferent feedback from the resulting movement. The distance of the perceived test coordinates from the real test coordinates also obeys a normal probability density function with a fixed variance (the Perception variance, PerceptVar). With the new performance coordinates, the model returns to step 1 (next iteration).
We term the extent by which the search-centre coordinates are adjusted (initially 50%) the Learning gain (LearnG), with a higher value denoting a greater degree of
thus reflects the capacity for plastic change in this model. It may be noted that the model is simply an iterative implementation of a fixed set of rules – it does not include the capacity to discover underlying rules that find a solution more efficiently.
This model is represented in flow diagram form in Figure 5.3b. The initial test coordinates were always (50, 50), and 100 iterations were performed in the course of each run. The effects of varying either the OutputVar or the PerceptVar were
examined by running the model 20 times at each set of values across a range and recording the resulting outputs. For each run of the model, the final output achieved was recorded as the mean of the last 10 trials (out of 100).
Figure 5.3: Model of a possible strategy for the ballistic learning task
We designed a simple iterative model by which an improvement in performance may be achieved in this task. The aim is to identify the optimum weighting of 2 simultaneously-discharged motor outputs (x and y), using the available performance feedback in the form of observed motor output.
(a) The motor output function employed in the model. The observed motor output (vertical axis) was defined as the sum of the contributions of the 2 individual motor outputs (x and y), each of which obeyed an inverse exponential function such that there is a single peak which represents the maximum possible peak acceleration. In the contour view (right), the asterisk represents the starting position.
(b) The structure of the model is given in flow diagram form. Random combinations of x and y are tried: if the resulting output of a given trial represents an improvement on prior performance then the search centre is moved in the perceived direction of the new coordinate. There are 2 distinct sources of variance, which remain fixed within a given run of the model: Output variance reflects the distribution of trials around the search centre, while Perception variance reflects error in the correct recollection of the previous trial coordinates.
The effects of changing the OutputVar (with PerceptVar set at 1 and LearnG set at 0.2), the PerceptVar (with OutputVar set at 7 and LearnG set at 0.2) and the effects of changing the LearnG (with OutputVar set at 7 and PerceptVar set at 1) are shown in
Figure 5.4. At each setting the final performance depended crucially on the variable tested.
For OutputVar (Figure 5.4a), low values resulted in poor final performance, with little improvement across the 100 trials. Increasing OutputVar was initially beneficial, but beyond an optimal value further increases resulted in impaired performance. For LearnG (Figure 5.4c), increases resulted in a similar inverse-U-shaped curve with an optimum value beyond which further increases were detrimental to performance. For PerceptVar (Figure 5.4b), by contrast, there was no optimal value – increasing this form of variability resulted in a steady decline in performance. Detailed analysis of the interaction between these variables demonstrates that this principle holds true for all values of these variables (Figure 5.4). Thus in this simple model of a learning strategy we observed a complex interaction between variability and learning, with divergent effects of the 2 forms of variability tested on final performance. For OutputVar and LearnG, there is an optimum range where maximal performance gain occurs while any increase in PerceptVar is detrimental.
Fig 5.4: The interaction of performance variability, perception variability and learning gain in the ballistic learning model The model was run 20 times with each setting of learning gain (LearnG), and the final performance recorded as the mean of the last 10 trials (out of 100).
(a) For output variability (OutputVar) the U-shaped curve persists across the 3 values of LearnG tested: with increasing OutputVar learning initially increases and then drops off, such that there is an optimum value of OutputVar for each setting of LearnG. Interestingly, the curves for the low and high values of LearnG intersect.
(b) For perception variability (PerceptVar) the curve is similar across the 3 values of LearnG tested: increasing PerceptVar is consistently detrimental to learning outcome.
(c) For learning gain (Learn Gain) the curve demonstrates a U-shaped tendency which holds true whatever value of OutputVar.
The interaction between LearnG and OutputVar suggests that in this model there is an interactive relationship between plasticity (LearnG) and output variability: when the capacity for plasticity is low then effective learning is favoured by greater output variability, while in the context of greater plasticity less variability is favourable.
5.5.1 Implications of the model
The idea that increasing performance variability may improve learning may initially seem counter-intuitive. However, there are situations in which variability can drive performance change. In our simple model of the task, we assume that the subject is
unable to formulate a set of rules to aid improvement from the information available about the current trial. This contrasts, for example, with sequence learning in which knowledge of the sequence in itself predicts the optimal output on each trial and is validated by the fact that there was no improving probability of learning between blocks within a session.
In the model, the drivers to performance change were random variation in motor output (OutputVar) and in the memory of the previous trial’s output (PerceptVar). The effects of altering the two forms of variability were fundamentally different. While variability in accurate recollection of the previous trial (PerceptVar) was entirely detrimental effect to final performance, the same was not true for variability in search-centre coordinates (OutputVar) where an inverted U-shaped curve was observed.
Increasing OutputVar allows the model to try a wider range of combinations, allowing the system to ‘escape’ a performance plateau and continue improving. This is akin to a selection process where a degree of diversity allows a gradual evolutionary process to occur. On the other hand, excessive OutputVar adversely affects the reproducibility of good movements so that learning suffers. The LearnG determines the extent of output adjustment made in response to an improvement and so reflects the degree of plasticity available. A relatively small amount is required for optimal learning, beyond which improvement declines. Impaired performance at LearnG higher values were explained here by greater system variability, suggesting a complex interaction between plasticity and variability. Thus, a highly variable system would benefit from less plasticity (due to the risk of learning an error) while a less variable system would benefit from greater plasticity.
These results are obtained from a simple model that shows how the motor system might operate in the absence of rule-based optimisations. Although an extreme
example, it does serves to illustrate that a beneficial role for output variability in learning is at least feasible. It is interesting that even in such a simple system there is a clear interaction between variability and plasticity with respect to net performance gain. We sought now to test whether performance variability in our experimental data from experiment 1 is associated with improved performance.