Learning mechanisms - Representation and Interaction of Sensorimotor Learning Processes

Error-based learning

The sensory predictions made by internal forward models are essential for detecting environmental or body-related changes (Shergill et al., 2003). The discrepancy between the prediction of the forward model and the actual sensory feedback is referred to as the sensory prediction error, which informs the motor system of a change in the dynamics of the world. How does the motor system respond to sensory prediction errors and use them to adapt its internal models? It has been shown that properties such as the size of the error, its source and its reliability strongly affect the way we adapt to or learn new sensorimotor transformations (Berniker and Kording, 2008; Criscimagna-Hemminger et al., 2010; Izawa and Shadmehr, 2011).

The effect of error size on adaptation has typically been examined by comparing abrupt versus gradual exposure to force field perturbations. For instance, when errors were small (due to gradual perturbations), adaptation led to longer-lasting aftereffects (Hatada et al., 2006) and a smaller rate of decay (Huang and Shadmehr, 2009). Likewise, patients with severe cerebellar degeneration showed deficits in adaptation when they were exposed to sudden force field perturbations, which led to large errors (Rabe et al., 2009). However, when the force field was applied gradually over many trials (causing small errors), a significant improvement in adaptation was observed (Criscimagna-Hemminger et al., 2010). It was suggested that large and small errors in force field learning involve different adaptive mechanisms in the motor system. These findings, however, seemed to be specific to dynamic learning, as visuomotor rotation studies in which patients were exposed to gradual versus abrupt visuomotor transformations failed to show similar effects (Schlerf et al., 2013).

Studies have also examined how adaptation scales with the size of error. In theoretical models of sensorimotor learning, it is usually assumed that adaptation increases linearly as a function of error magnitude (Donchin et al., 2003; Lee and Schweighofer, 2009; Smith et al., 2006; Thoroughman and Shadmehr, 2000). However, existing behavioural data suggest that the relationship between the error size and the extent of learning is nonlinear (Marko et al., 2012; Robinson et al., 2003; Wei and Körding, 2009). For instance, it has been shown that the amount of learning is relatively larger for small errors, and rapidly saturates as the error size increases (Marko et al., 2012; Wei and Körding, 2009). A recent study further proposed that, even for a fixed error size, the amount of learning could vary depending on the consistency of the environment (Herzfeld et al., 2014). When the environment was inconsistent (the error direction changed frequently due to random perturbations), the amount of learning from a given error value decreased, whereas in a consistent environment (the perturbation direction remained the same), the same error value led to a larger amount of learning. These results indicated that learning is modulated not only by the size of error, but also by its reliability.
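The contrast between the linear assumption and a saturating relationship can be illustrated with a minimal sketch. The `tanh` form and all parameter values below are assumptions chosen for illustration only; they are not the functions fitted in the cited studies.

```python
import math


def linear_update(error, sensitivity=0.2):
    """Adaptation proportional to error (the common modelling assumption)."""
    return sensitivity * error


def saturating_update(error, max_update=2.0, scale=5.0):
    """Adaptation that grows quickly for small errors and then levels off.

    tanh is a hypothetical choice of saturating function, not a fitted one.
    """
    return max_update * math.tanh(error / scale)


def fraction_corrected(update_fn, error):
    """Share of the experienced error that is compensated on the next trial."""
    return update_fn(error) / error


if __name__ == "__main__":
    for err in (1.0, 5.0, 20.0):
        print(err,
              round(fraction_corrected(linear_update, err), 3),
              round(fraction_corrected(saturating_update, err), 3))
```

Under the linear rule the corrected fraction is the same for every error size, whereas under the saturating rule it shrinks as the error grows, which is the qualitative pattern described above.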

Learning from error may also take place at different rates. There has been considerable debate as to what determines the speed with which we learn a sensorimotor transformation (Burge et al., 2008; Gonzalez Castro et al., 2014; van Beers, 2012; Wei and Körding, 2010). A popular approach in this regard has been the use of Bayesian principles, according to which the variation in learning rate is attributed to two different sources of uncertainty in the environment: the feedback uncertainty and the state uncertainty. For example, in visuomotor rotation studies, it was found that humans learn the task at a slower rate when the visual feedback of the hand location is blurry (feedback uncertainty). In contrast, when the visual feedback is sharp but systematically deviated from the actual hand location (state estimation uncertainty), learning takes place at a faster rate (Burge et al., 2008; Wei and Körding, 2010). This behaviour could be explained by a Kalman filter model, in which the learning rate was formulated by the Kalman gain as a function of uncertainty in feedback and state estimation (Burge et al., 2008; van Beers, 2012):

$$K \approx \frac{\sigma_x^2}{\sigma_x^2 + \sigma_u^2} \tag{1.1}$$

where K was the Kalman gain (the learning rate), and σ_x and σ_u represented the uncertainty (noise) in state estimation and visual feedback, respectively. Predictions of the Kalman filter agreed with some of the experimental observations, but failed to explain others. For instance, in the presence of random perturbations in visual feedback (which increased the feedback uncertainty), the Kalman filter predicted a slower adaptation rate, whereas no significant change in adaptation rate was observed in human behaviour (Burge et al., 2008).
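As a minimal sketch of how equation 1.1 translates into trial-by-trial behaviour, the snippet below computes the gain from the two variances and applies it as a fixed learning rate to a constant perturbation. The function names, the noise-free single-state update and the numerical values are assumptions made for illustration; they are not taken from the cited models.

```python
def kalman_gain(state_var, feedback_var):
    """Gain of equation 1.1: state variance over total variance."""
    return state_var / (state_var + feedback_var)


def simulate_adaptation(perturbation, gain, n_trials=30):
    """Noise-free learner that corrects a fraction `gain` of each error."""
    estimate = 0.0
    history = []
    for _ in range(n_trials):
        error = perturbation - estimate   # sensory prediction error
        estimate += gain * error          # error-based update
        history.append(estimate)
    return history


if __name__ == "__main__":
    # Hypothetical variances: blurry feedback raises feedback_var,
    # an unstable environment raises state_var.
    blurry = kalman_gain(state_var=1.0, feedback_var=4.0)
    sharp = kalman_gain(state_var=1.0, feedback_var=0.25)
    print("gain with blurry feedback:", blurry)
    print("gain with sharp feedback:", sharp)
    print("estimate after 5 trials (blurry):", simulate_adaptation(30.0, blurry)[4])
    print("estimate after 5 trials (sharp):", simulate_adaptation(30.0, sharp)[4])
```

Increasing the feedback variance lowers the gain and slows adaptation, whereas increasing the state variance raises it, matching the direction of the effects described for blurry and shifted feedback.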

Recently, Gonzalez Castro et al. (2014) suggested that the rate of adaptation is primarily determined not by the estimate of state or feedback uncertainty in the current trial (i.e., the Kalman gain in equation 1.1), but by the predictability of future environmental changes (how likely it is that the currently experienced perturbation will be repeated). Experimental data from force field learning showed that when the perturbations were persistent (predictable), the learning rate increased, whereas under rapidly changing (unpredictable) perturbations, the learning rate decreased. Similar findings were reported in a study in which the learning rate was attributed to the predictability of future error signals (prospective errors; Takiyama et al., 2015). That is, in an environment where errors were consistent and predictable, learning took place at a faster rate than in environments in which the error direction changed randomly and frequently. Taken together, these studies shed light on how error signals, defined as the difference between the predicted and perceived sensory information, are used to adapt the internal representation of the body and its surrounding world.
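One simple way to picture a learning rate that depends on the consistency of errors is a rule that raises the error sensitivity when consecutive errors share a sign and lowers it when the sign flips. The sketch below is a hypothetical rule written for illustration; its step size, bounds and function names are assumptions, and it is not the model proposed in any of the cited studies.

```python
def update_sensitivity(sensitivity, prev_error, error,
                       step=0.05, lower=0.0, upper=1.0):
    """Raise sensitivity after same-sign errors, lower it after a sign flip."""
    product = prev_error * error
    if product > 0:        # consistent direction: trust the error more
        sensitivity += step
    elif product < 0:      # direction flipped: trust the error less
        sensitivity -= step
    return min(upper, max(lower, sensitivity))


def final_sensitivity(errors, sensitivity=0.3):
    """Run the rule over a sequence of errors and return the end value."""
    prev_error = 0.0       # no history before the first trial
    for error in errors:
        sensitivity = update_sensitivity(sensitivity, prev_error, error)
        prev_error = error
    return sensitivity


if __name__ == "__main__":
    consistent = [1.0, 1.0, 1.0, 1.0, 1.0]
    alternating = [1.0, -1.0, 1.0, -1.0, 1.0]
    print("consistent environment:", final_sensitivity(consistent))
    print("alternating environment:", final_sensitivity(alternating))
```

With these assumed settings the sensitivity rises in the consistent sequence and falls in the alternating one, so the same error magnitude produces more learning when errors are predictable.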

Reward-based learning

Besides sensory prediction error, motor learning is influenced by motivational feedback such as reward and punishment. Subjects have shown the ability to learn visuomotor rotation tasks purely from reward-based feedback (success or failure in hitting the target), whether the rotation was imposed gradually (Izawa and Shadmehr, 2011) or abruptly (Nikooyan and Ahmed, 2015). Reward-based feedback can have a strong complementary effect on learning when provided along with the prediction error. Studies show that when error and reward feedback are both available, the learning process is significantly accelerated (Kojima and Soetedjo, 2017; Nikooyan and Ahmed, 2015; Quattrocchi et al., 2017). In addition, reward has been shown to affect exploratory features of motor variability. In a reaching task towards an invisible rewarded target, subjects increased their reach variability as the probability of reward decreased (Pekny et al., 2015). This was interpreted as a search for target locations that would be rewarded. In contrast, when the average reward increased around a target position, the variability decreased.

Recent studies have also examined the effects of punishment in sensorimotor adaptation. Galea et al. (2015) demonstrated that punishment and reward had distinct effects on the learning process during a visuomotor rotation task. They found that punishment (negative feedback) increased the speed of learning more than reward (positive feedback), but that reward led to greater retention of learning. It was suggested that reward and punishment influenced sensorimotor learning via separate mechanisms. The same group of researchers further studied the effects of reward and punishment on patients with stroke during a force field adaptation task (Quattrocchi et al., 2017). They found that reward and punishment both increased the rate of learning in stroke patients to a level comparable to that of healthy age-matched subjects who performed the same task without reward or punishment. Interestingly, the retention of learning in the patients who received reward feedback exceeded that of healthy control subjects.

The learning patterns observed under reward-based versus error-based feedback have been shown to be fundamentally distinct. For instance, it has been shown that reward-based learning generalises only locally, leads to larger movement variability, and does not update the sensorimotor mapping between the hand and the cursor position (Izawa and Shadmehr, 2011). In a recent study, Cashaback et al. (2017) dissociated the effects of error-based and reward-based feedback in a reaching task. Subjects were exposed to lateral shifts of the cursor with respect to the hand position, where the magnitude of the shift was sampled randomly from a distribution whose mean and mode were separated. When provided with reward feedback, subjects learned to compensate for the shifts based on the mode of the distribution; when error-based feedback was provided, subjects learned the task based on the mean of the distribution; and when both error and reward feedback were available, subjects continued to adapt based on the mean of the distribution. These results suggested that reward-based and error-based learning mechanisms are separate, and that motor learning is predominantly driven by the error-based processes (Cashaback et al., 2017; Izawa and Shadmehr, 2011).
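The separation between mean and mode can be made concrete with a small sketch. The shift values below are hypothetical, and the two idealised learners are assumptions: an error-based learner is taken to minimise squared error (which yields the mean), and a reward-based learner is taken to pick the compensation that is exactly correct most often (which yields the mode). The actual distribution and reward rule of the cited experiment are not reproduced here.

```python
from collections import Counter
from statistics import mean

# Hypothetical, positively skewed set of lateral cursor shifts (arbitrary units).
SHIFTS = [0, 1, 1, 1, 1, 2, 3, 5, 8, 9]


def error_based_target(samples):
    """Compensation minimising mean squared error: the sample mean."""
    return mean(samples)


def reward_based_target(samples):
    """Compensation that is exactly right most often: the sample mode."""
    return Counter(samples).most_common(1)[0][0]


if __name__ == "__main__":
    print("error-based learner settles near:", error_based_target(SHIFTS))
    print("reward-based learner settles near:", reward_based_target(SHIFTS))
```

Because the assumed distribution is skewed, the two idealised learners settle on different compensations, which is the property that allows the two feedback types to be dissociated.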

Use-dependent learning

In error-based and reward-based learning, motor commands are adjusted in a direction that would reduce error and increase reward. Recent studies have revealed a further phenomenon in which motor commands tend to generate movements that are similar to previous movements. That is, when reaching movements are made repeatedly in a particular direction, future movements will be biased towards that direction (Huang et al., 2011; Verstynen and Sabes, 2011). This induced bias towards previously performed movements is called use-dependent plasticity or use-dependent learning. Diedrichsen et al. (2010) demonstrated that use-dependent and error-based learning can simultaneously affect behaviour. Subjects performed reaching movements towards a horizontally elongated target, while their movements were constrained by channel trials (simulated one-dimensional spring walls that guided the movements in a straight line). The channel trials guided the movements in a direction that deviated slightly from the straight movement subjects normally performed without the channels. The imposed directional error caused an adaptive response, such that when the channels were removed, subjects showed marked after-effects in the direction opposite to the channels (error-based adaptation). However, after a few trials, their movements started to converge back towards the direction initially imposed by the channel trials. This tendency towards the repeated movement direction indicated a use-dependent after-effect.
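The idea of a bias that drifts towards repeated movements can be sketched with a single-state update in which the bias moves a small fraction of the way towards each executed direction. The update rule, the rate and the 10-degree example are assumptions made for illustration, not the model used in the cited studies.

```python
def use_dependent_bias(executed_directions, rate=0.1, bias=0.0):
    """Return the directional bias after a sequence of executed movements.

    Each movement pulls the bias a fraction `rate` of the way towards the
    direction that was just executed (directions in degrees).
    """
    for direction in executed_directions:
        bias += rate * (direction - bias)
    return bias


if __name__ == "__main__":
    # Hypothetical block of 50 reaches all guided 10 degrees off straight ahead.
    repeated = [10.0] * 50
    print("bias after repetition:", use_dependent_bias(repeated))
```

In this sketch no error signal is involved: the bias builds up purely from repetition, which is what distinguishes use-dependent learning from the error-based adaptation described in the channel experiment.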
