Results - Simultaneous incremental neuroevolution of motor control, navigation and object manip

Figures 6.2 and 6.3 show progress over evolutionary time for the two types of run undertaken: populations evolved from random starting points, and populations previously successful in the 3D RC task. One extremely high-fitness run was noted in the random category; visual comparison with other runs showed this species to be a classic degenerate solution whose strategy is to rapidly vibrate the block to achieve high fitness.

Figure 6.2: Fitness on the BD task (moving average over a 1000-tournament moving window) for evolution from a random (unevolved) population.

Figure 6.3: Fitness on the BD task (moving average over a 1000-tournament moving window) for evolution from a naive (evolved in 3D RC) population.
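The smoothing used in figures 6.2 and 6.3 can be illustrated with a minimal sketch of a trailing moving average. The function name, the choice of a trailing (rather than centred) window, and the sample fitness values are assumptions made for illustration; the source states only that a 1000-tournament moving window was used.

```python
from collections import deque


def moving_average(values, window=1000):
    """Trailing mean over the last `window` values.

    One output value is produced for each position at which a full
    window of data is available.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    buf = deque()
    total = 0.0
    out = []
    for v in values:
        buf.append(v)
        total += v
        if len(buf) > window:
            # Drop the oldest value so the sum covers exactly `window` items.
            total -= buf.popleft()
        if len(buf) == window:
            out.append(total / window)
    return out


# Hypothetical per-tournament fitness scores (illustrative only).
fitness = [10.0, 12.0, 11.0, 15.0, 14.0]
smoothed = moving_average(fitness, window=3)
print(smoothed)
```

With real data the window would be 1000 tournaments, so the first plotted point would appear only after 1000 tournaments have been completed.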

CHAPTER 6. OBJECT MANIPULATION IN 3D VIRTUAL CREATURES

[Figure 6.4 plot. Axis: Displacement Achieved (0 to 300). Groups: Evolved from random, Evolved from naive, Naive performance.]

Figure 6.4: Comparison of the best individuals from the naive population, and from populations evolved from the random (unevolved) and naive-evolved populations.

H1: The hybrid architecture is sufficient to achieve feedback control that allows agents to successfully manipulate and guide an external object

Visualisations of agent behaviour can be seen at https://youtu.be/gZaUvXcdMK8, and figure 6.5 provides a static view of an agent. The zoopraxiscopic figures, presented in the style of Eadweard Muybridge (Muybridge, 1887), show a time series of snapshots that illustrate how agents approach the block from a distance (figure 6.6), and manipulate the block in their world (figure 6.7). In order to gauge the importance of the architectural components, agents were observed under impoverished sensory conditions. The behavioural results of sensory deafferentation are presented in figures 6.8 and 6.9. Figure 6.8 shows the planar trajectory followed by agents approaching the block from a distant point under various deafferentation conditions; figure 6.9 shows the response of agents to the same sensory culling in a closer, control scenario.

Figure 6.5: Visualisation of a single agent in the block-displacement world. Agent is displaying a low, heavy gait suitable for block pushing.

Figure 6.6: Approach gait. The agent is moving toward the block from a distance. All limbs are contributing to the movement.

Figure 6.7: Control gait. The agent is pushing forward with its ‘back’ limbs, maintaining the block between its forelimbs.

Figure 6.8: Agent-block trajectories of best agent from best overall trained population under various deafferent sensory treatments; approach task.

Figure 6.9: Agent-block trajectories of best agent from best overall trained population under various deafferent sensory treatments; control task.

Figure 6.8 shows an agent that begins at (5,5) and attempts to reach the block at (20,20). The unaltered agent’s trajectory is shown in the top left; this agent tends to overshoot its target and then correct by rotating, as the two loops in the path record show. All sensors have some effect on this behaviour, although cutting sensor 1 produces by far the most pronounced difference of any single cut. In complete deafferentation (bottom left) the agent moves randomly. In contrast, in figure 6.9 the agent begins at (18,20), adjacent to the block at (20,20). The unaltered agent pushes the block in a tight circle to maximise fitness (top left). Sensory deafferentation does not have a catastrophic effect here as it does in the approach task: all single cuts still maintain block movement, although the trajectory is less efficient, as does the dual cut of sensors 1 and 2. Only cutting sensors 3 and 4, or complete deafferentation, resulted in failure to displace the block at all. Taken together, the figures demonstrate that information from the agents’ sensors is being combined to generate reliable gaits for distance approach and block control.
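The deafferentation conditions described above can be enumerated in a short sketch. It assumes exactly four sensors (only sensors 1 to 4 are mentioned in the text) and assumes that a cut is modelled by forcing the sensor's reading to zero; neither detail is confirmed by the source, and the function name and sample readings are hypothetical.

```python
def deafferent(readings, cut):
    """Return a copy of the sensor readings with the cut sensors silenced.

    `readings` maps sensor number -> value; `cut` is a set of sensor numbers.
    Silencing is modelled as a zero reading (an assumption).
    """
    return {s: (0.0 if s in cut else v) for s, v in readings.items()}


SENSORS = (1, 2, 3, 4)  # assumed sensor count

# Conditions named in the text: intact, each single cut,
# the two dual cuts, and complete deafferentation.
conditions = {"intact": set()}
conditions.update({f"cut {s}": {s} for s in SENSORS})
conditions["cut 1+2"] = {1, 2}
conditions["cut 3+4"] = {3, 4}
conditions["complete"] = set(SENSORS)

readings = {1: 0.8, 2: 0.3, 3: 0.5, 4: 0.1}  # hypothetical values
for name, cut in conditions.items():
    print(name, deafferent(readings, cut))
```

Each condition would then be run from the same start position, so that differences in trajectory can be attributed to the missing sensory input.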

H2: There is some overlap between the 3D RC task and the BD task due to the requirement for speedy and accurate movement in both environments.

A non-parametric correlation analysis was undertaken between the species’ relative ranks for mean fitness during the final 1000 tournaments of the 250k-tournament 3D RC pre-evolutionary runs and the mean score on the BD task. Figure 6.10 presents this correlation graphically for both two-minute and ten-minute evaluation times. In the 10m trial we found a statistically significant although weak correlation (ρ = 0.38; H0 p < 0.05). The correlation between 3D RC and BD performance in the 2m BD trial is stronger (ρ = 0.51; H0 p < 0.05).
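A rank-based correlation of this kind can be sketched as follows. The use of Spearman's coefficient is an assumption: the source says only that the analysis was non-parametric and reports ρ. The function names and the five sample scores are hypothetical and are not taken from the experiments.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank vectors."""
    if len(x) != len(y):
        raise ValueError("sequences must have equal length")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for five species (illustrative only).
rc_fitness = [250.0, 260.0, 270.0, 280.0, 290.0]
bd_score = [5.0, 4.0, 9.0, 8.0, 12.0]
rho = spearman(rc_fitness, bd_score)
print(rho)
```

Because only ranks enter the calculation, the result is insensitive to the very different scales of the 3D RC and BD fitness scores.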

Figure 6.10: Across-species correlation comparing 3D RC performance and BD performance. Outcomes across the two tasks are more correlated when evaluation time is shorter (ρ = 0.51), indicating that movement speed is a factor in success in the block task and shared between the two problems. However, a strong gait is required to push the block and this is not selected for in the 3D RC task, hence the lesser correlation in the 10m task (ρ = 0.38).

H3: Species evolved in the 3D RC task show increased performance after evolution in the BD environment, and the final performance is not significantly different to species evolved from random in the BD environment.

Table 6.1 compares the naive BD score (column 4) with the evolved score (column 5) for all 20 species; column 6 gives the difference. There is a clear improvement in all cases over the 25k-tournament evolutionary run: the mean fitness over all naive populations was 37.29, compared to 124.16 in the evolved set (H0 p < 10^-10). Figure 6.2 shows progress of runs beginning from random genotypes over evolutionary time (100k tournaments in 1k-tournament averages). Figure 6.3 shows the same view of populations beginning from naive genotypes, over 25k tournaments. Both treatments show a levelling off of fitness, and there is no significant difference between the best individual evaluation performance of the two starting conditions across all replicates, as shown by the proximity of the two treatments’ box plots in figure 6.4. A correlation was found between naive score and evolved scores across the 20 species (ρ = 0.59; H0 p < 0.01) but no correlation between the naive scores and the magnitude of the change in fitness (ρ = 0.26; H0 p > 0.1). Figure 6.11 demonstrates these relationships: δ-fitness is uncorrelated with naive fitness. It was also noted that observed behaviours of the two types of agent were qualitatively indistinguishable, implying that there are potentially few avenues to solve the task, but crucially also showing that the employed decomposition of the task does not hamper the search for this behaviour.

Run   3D RC   Nve2  Nve10   Evl10     δ-f
  1  306.78   6.50  38.17  123.07   84.90
  2  254.81   4.25  24.42   65.84   41.43
  3  283.30   6.22  31.10  130.69   99.58
  4  281.91   8.63  51.09  144.80   93.72
  5  328.82   9.58  56.36  133.40   77.04
  6  271.77  10.29  58.90  147.82   88.92
  7  343.80  15.10  42.98  154.68  111.70
  8  291.27  14.97  64.90  160.82   95.92
  9  284.95   7.58  29.23  110.74   81.52
 10  267.79   9.87  23.66  105.04   81.38
 11  275.82  10.09  49.98  205.49  155.52
 12  251.57   5.21  36.72  103.01   66.29
 13  280.65   8.93  41.19  144.42  103.23
 14  310.35   4.91  19.16   60.29   41.13
 15  281.82   8.71  41.98   69.98   28.00
 16  273.59   5.52  28.29  130.22  101.93
 17  277.03   8.53  37.01  164.83  127.82
 18  278.92   6.99  35.11   87.02   51.92
 19  280.14   8.54  26.86  155.60  128.74
 20  250.72   4.75   8.71   85.42   76.71

Table 6.1: Results table showing all 20 species’ performance in 3D RC, naive block (2m and 10m evaluation) and evolved block tasks, and difference between naive and evolved. There is a relationship between prior and post performance, but not between prior performance and δ-fitness.
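The summary figures quoted for Table 6.1 can be rechecked with a short sketch. It assumes that δ-f is the evolved 10m score minus the naive 10m score, which the tabulated values are consistent with to within rounding. The variable names are illustrative; the numbers are copied from the Nve10, Evl10 and δ-f columns in run order.

```python
# (naive 10m score, evolved 10m score, reported delta) for runs 1..20,
# copied from Table 6.1.
rows = [
    (38.17, 123.07, 84.90), (24.42, 65.84, 41.43), (31.10, 130.69, 99.58),
    (51.09, 144.80, 93.72), (56.36, 133.40, 77.04), (58.90, 147.82, 88.92),
    (42.98, 154.68, 111.70), (64.90, 160.82, 95.92), (29.23, 110.74, 81.52),
    (23.66, 105.04, 81.38), (49.98, 205.49, 155.52), (36.72, 103.01, 66.29),
    (41.19, 144.42, 103.23), (19.16, 60.29, 41.13), (41.98, 69.98, 28.00),
    (28.29, 130.22, 101.93), (37.01, 164.83, 127.82), (35.11, 87.02, 51.92),
    (26.86, 155.60, 128.74), (8.71, 85.42, 76.71),
]

naive = [n for n, _, _ in rows]
evolved = [e for _, e, _ in rows]

mean_naive = sum(naive) / len(naive)
mean_evolved = sum(evolved) / len(evolved)

# Recompute delta-fitness and compare it with the reported column.
deltas = [e - n for n, e, _ in rows]
max_dev = max(abs(d - reported) for d, (_, _, reported) in zip(deltas, rows))
all_improved = all(d > 0 for d in deltas)

print(round(mean_naive, 2), round(mean_evolved, 2))
print(max_dev, all_improved)
```

The recomputed means agree with the 37.29 and 124.16 quoted in the text, and every run shows a positive change in fitness.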

Figure 6.11: Correlation of evolved fitness with naive fitness (ρ = 0.59), and delta fitness with naive fitness (ρ = 0.26).

In document Simultaneous incremental neuroevolution of motor control, navigation and object manipulation in 3D virtual creatures (Page 98-105)