session, a real-time, zero-state classifier was used. Classification used a 3-second window that was re-evaluated every 200 ms. The multi-class LDA calculated a score for each class, as explained in (Kapeller et al., 2013). These scores were normalised using a softmax transform (Volosyak, 2011), and a threshold based on the 95% confidence interval was used to accept or reject each classification.
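The accept-or-reject step can be illustrated with a minimal sketch. It is not the classifier used in the study: the function names are hypothetical, and the fixed probability threshold of 0.95 is only an assumed stand-in for the 95% confidence criterion, whose exact derivation is not given here. The window length and evaluation step are taken from the text.

```python
import math

WINDOW_S = 3.0   # window length, from the text
STEP_S = 0.2     # evaluation interval, from the text
ACCEPT_P = 0.95  # ASSUMPTION: stand-in for the 95% confidence criterion


def softmax(scores):
    """Map raw class scores to probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decide(scores, threshold=ACCEPT_P):
    """Return the index of the winning class, or None for the zero state."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else None


def window_starts(duration_s, window_s=WINDOW_S, step_s=STEP_S):
    """Start times (in seconds) of every full window in a recording."""
    n = int(round((duration_s - window_s) / step_s)) + 1
    return [round(i * step_s, 3) for i in range(max(n, 0))]
```

In this sketch, a rejected classification corresponds to the zero state, in which no robot action is triggered.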
4.2.3 Eye Tracking Setup
The HMD was factory-fitted with a ViewPoint Eye Tracker by Arrington Research on the right eye. This tracker is made specifically for use in HMDs. The initial calibration followed the procedure described in (Borland et al., 2013). Robot gestures were controlled by dwelling on the same boxes as in the SSVEP condition; since the boxes did not need to blink in this condition, they were displayed statically. A dwell time of 1 second was chosen for the trigger to limit the number of false positives, that is, the Midas touch issue (Penkar et al., 2012). A short in-world training was developed to optimise participant performance with this control setup. A cone was temporarily displayed on screen, originating in front of the right eye and pointing in the detected gaze direction. Participants were advised on how to adjust the location of their gaze slightly in order to trigger the gestures more accurately. The cone was helpful for learning to use the tracker, but it was uncomfortable to view and blocked the participants' vision; it was therefore removed before the experiment started.
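The dwell-based trigger can be sketched as follows. This is an illustrative sketch rather than the software used in the study: the class name, the box labels, and the rule that a box fires only once per continuous dwell are assumptions. Only the 1-second dwell time comes from the text.

```python
DWELL_S = 1.0  # dwell time, from the text


class DwellTrigger:
    """Fire once when the gaze stays on the same box for dwell_s seconds."""

    def __init__(self, dwell_s=DWELL_S):
        self.dwell_s = dwell_s
        self.current = None  # box currently looked at, or None
        self.since = None    # time at which the gaze entered that box
        self.fired = False   # whether this dwell has already triggered

    def update(self, box, t):
        """Feed one gaze sample; return the box name when it triggers, else None."""
        if box != self.current:
            # Gaze moved to a different box (or away): restart the timer.
            self.current, self.since, self.fired = box, t, False
            return None
        if box is not None and not self.fired and t - self.since >= self.dwell_s:
            self.fired = True  # ASSUMPTION: one trigger per continuous dwell
            return box
        return None
```

A brief glance at a box therefore never triggers a gesture, which is how a dwell threshold limits Midas touch false positives.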
4.3 Scenario and Procedure
The basic scenario was a social, robotic telepresence setup. Participants saw the remote physical setting via a stereo HMD; the stereo-paired images were produced by two cameras mounted just above the eyes of the robot, which was physically located just outside the laboratory where the participants were seated. A mirror was placed on a table directly in front of the robot so that participants could see most of the robot's body reflected, a setup that has been shown to be effective in inducing BOI (González-Franco et al., 2010; Preston et al., 2015). The participant interacted with the experimenter in the remote space.
The same procedure was followed for both the SSVEP and eye-tracking (ET) conditions, with the exception of the initial, condition-dependent setup. Participants were familiarised with the equipment and the robot. They were shown the overlay interface for triggering the gestures, which consisted of four boxes displayed in front of the remote scene, and were informed of each gesture. The system was then calibrated for the respective condition.
In the SSVEP condition, participants were instructed on how to control the SSVEP interface. Although four participants reported having experience with EEG, none was experienced with SSVEP. The EEG cap was mounted and the HMD donned. The SSVEP training was performed as outlined in Section 4.2.2. The HMD was removed during offline generation of the classifier. Up to three trainings were performed and the classifier with the lowest error was used; if the participant achieved a classifier error of less than 15% in any training session, that classifier was used without further training. Once the classifier was trained, the HMD was put back on for the experimental session.
In the ET condition, initial calibration was performed as outlined in (Borland et al., 2013). Training with the full interface was performed with a cone, as previously described.
At this point, the main task started, and the procedure was the same for both conditions. Video streaming from the two robot eye-positioned cameras was started, along with the sound connection between the spaces. Participants were encouraged to move their heads and look around, and were greeted by the experimenter, who was now in the external space with the robot.
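The rule for choosing the SSVEP classifier (up to three trainings, stopping early once the error falls below 15%, otherwise keeping the lowest-error classifier) can be expressed as a short sketch. The function `select_classifier` and the callback `train_once` are hypothetical names introduced for illustration; `train_once` is assumed to run one training session and return a classifier together with its error rate.

```python
MAX_TRAININGS = 3        # from the text
EARLY_STOP_ERROR = 0.15  # from the text: error below 15% ends training


def select_classifier(train_once, max_trainings=MAX_TRAININGS,
                      early_stop=EARLY_STOP_ERROR):
    """Return (classifier, error, sessions_run) under the selection rule.

    train_once is a hypothetical callback that performs one training
    session and returns a (classifier, error) pair.
    """
    best = None
    for run in range(1, max_trainings + 1):
        clf, err = train_once()
        if best is None or err < best[1]:
            best = (clf, err)  # keep the lowest-error classifier so far
        if err < early_stop:
            return clf, err, run  # good enough: no further training
    return best[0], best[1], max_trainings
```

For example, with session errors of 30%, 10% and 5%, the sketch stops after the second session and never runs the third.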
Figure 4.2: The boxes overlaid on the video stream triggered the corresponding robot actions: when the left box was selected, the robot pointed to the left; when the upper box was selected, the robot pointed to the front; when the right box was selected, the robot waved hello; and when the lower box was selected, the robot performed the ‘I don’t know’ gesture. The idle robot state was triggered by looking anywhere away from the four boxes.
Four robot actions were created specifically for the experiment, as illustrated in Figure 4.2. These actions were chosen for their simplicity and for their appropriateness as gestures in a social setting. Now embodied in the robot, participants were again instructed informally about how to trigger each of the gestures. Then the skill test was started, where participants were asked to trigger each of the four actions four times. The experimental setup is shown in Figure 4.3. The participants’ performance was measured during this part of the experiment. After the skill test, the experimenter left to return to the laboratory. The session was then ended, and the equipment was removed from the participant. The participant then answered the post-session questionnaire.
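The box-to-gesture mapping of Figure 4.2 and the skill test target of four triggers per action can be summarised in a small sketch. The box labels, action identifiers and function names below are hypothetical and chosen only for illustration; the sketch tracks how many triggers remain and makes no assumption about how performance was scored in the study.

```python
from collections import Counter

# Box position -> robot action, following Figure 4.2 (identifiers are hypothetical).
GESTURES = {
    "left": "point_left",
    "up": "point_front",
    "right": "wave_hello",
    "down": "dont_know",
}
REPEATS = 4  # each action was to be triggered four times


def robot_action(box):
    """Return the action for a selected box; anything else leaves the robot idle."""
    return GESTURES.get(box, "idle")


def skill_test_remaining(triggered_boxes, repeats=REPEATS):
    """Count how many triggers of each action are still needed."""
    done = Counter(robot_action(b) for b in triggered_boxes)
    return {g: max(repeats - done[g], 0) for g in GESTURES.values()}
```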
Figure 4.3: View from the robot's eyes. The boxes overlaid on the live video stream allowed control of the robot in both conditions. The image shows the mirror facing the robot, the control boxes that the participant used to trigger the gestures, and the experimenter who interacted with the robot.