Study Architecture and Data Collection

4.3 Study Design

4.3.3 Study Architecture and Data Collection

We favoured tablets over other touch devices for the data collection because of their reasonable size, prevalence, and compatibility with eye trackers. We used a Microsoft Surface Pro 3 (2160×1440 pixel resolution) and the Tobii X2-60 eye tracker (60 Hz), which is designed for studies on smaller devices. We included the Tobii rack in our apparatus for its compatibility with the eye tracker and for its design: two guide bars keep users from placing their hands above the eye tracker, which prevents hand occlusion. We selected this rack after running a pilot study to validate that it would not impair the participants’ interaction (cf. Section 4.2). Figure 4.8 shows how the stand and the eye tracker were set up for the data collection. Participants sat on a chair to interact with the tablet.

4. Correlation between Gaze and Tap

Figure 4.8: Eye tracker and stand configuration.

To preserve the naturalness of the interaction, we did not ask the participants to limit their hand actions to the dominant hand: they were free to interact however they liked. The data collection retrieved the following information: touch input, on-screen gaze position, and tapped object characteristics. Touch data was collected in several steps, which are detailed in Section 3.2. The resulting hand gesture files contained the following information:

• timestamp,

• type of gesture (TOUCH_EVENT, DRAG_EVENT or ZOOM_EVENT),

• normalised position of the event,

• optional parameter, only set for ZOOM_EVENT to indicate the zoom factor.
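As an illustration only, the record structure listed above could be read as follows. This is a minimal sketch: the comma-separated layout, the field order, and the names GestureRecord, parse_gesture_line and parse_gesture_log are assumptions made for the example, since the actual file format is described in Section 3.2 and not reproduced here.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, Optional

# Gesture types named in the text.
GESTURE_TYPES = {"TOUCH_EVENT", "DRAG_EVENT", "ZOOM_EVENT"}


@dataclass
class GestureRecord:
    timestamp: float                     # unit is an assumption (e.g. milliseconds)
    gesture: str                         # one of GESTURE_TYPES
    x: float                             # normalised horizontal position
    y: float                             # normalised vertical position
    zoom_factor: Optional[float] = None  # only set for ZOOM_EVENT


def parse_gesture_line(fields: List[str]) -> GestureRecord:
    """Build one record from a row of already-split text fields."""
    timestamp, gesture, x, y = fields[:4]
    if gesture not in GESTURE_TYPES:
        raise ValueError(f"unknown gesture type: {gesture}")
    zoom = None
    if gesture == "ZOOM_EVENT":
        # The optional fifth field carries the zoom factor.
        if len(fields) < 5 or fields[4] == "":
            raise ValueError("ZOOM_EVENT without a zoom factor")
        zoom = float(fields[4])
    return GestureRecord(float(timestamp), gesture, float(x), float(y), zoom)


def parse_gesture_log(text: str) -> List[GestureRecord]:
    """Parse a whole log given as text, skipping empty lines."""
    return [parse_gesture_line(row) for row in csv.reader(io.StringIO(text)) if row]
```

Keeping the zoom factor as an optional field mirrors the description above, where the parameter is only present for ZOOM_EVENT.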

We wrote an application (in C#) that retrieved gaze data samples from the eye tracker and logged them into a file, which contained the following:

• timestamp,

• for each eye: estimated point of gaze in normalised screen coordinates,

• for each eye: validity code (defined in the Tobii specifications to indicate the reliability of the tracking).
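To make the role of the validity code concrete, the sketch below shows one possible way of turning such a sample into a single on-screen gaze point. It is not the procedure used in the study. It assumes that lower validity codes denote more reliable tracking, and the threshold MAX_VALID_CODE as well as the function names combine_eyes and to_pixels are hypothetical.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]

# Assumption: codes at or below this value are treated as reliable.
MAX_VALID_CODE = 1

# Native resolution of the tablet screen reported in the text.
SCREEN_W, SCREEN_H = 2160, 1440


def combine_eyes(left: Point, left_validity: int,
                 right: Point, right_validity: int) -> Optional[Point]:
    """Average the reliable eyes; return None if neither eye is usable."""
    usable = []
    if left_validity <= MAX_VALID_CODE:
        usable.append(left)
    if right_validity <= MAX_VALID_CODE:
        usable.append(right)
    if not usable:
        return None
    x = sum(p[0] for p in usable) / len(usable)
    y = sum(p[1] for p in usable) / len(usable)
    return (x, y)


def to_pixels(point: Point, width: int = SCREEN_W, height: int = SCREEN_H) -> Point:
    """Convert normalised screen coordinates to pixel coordinates."""
    return (point[0] * width, point[1] * height)
```

Falling back to a single eye when the other is unreliable keeps more samples than requiring both eyes, at the cost of a slightly noisier estimate.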

This application also ran the eye tracker calibration before each task and the drift evaluation afterwards. Fixations were computed post-hoc with OGAMA² using a spatial detection threshold of 22 pixels (approximately 0.56° of visual angle).

For this data collection, we implemented a web browser (a C# WinForms application providing a WebBrowser object that ran the Internet Explorer 11 engine). We decided to develop our own browser in order to easily get feedback from it and to offer a basic and sleek user interface for all participants. The browser measured 1440×960 pixels³, with a viewport of 1440×914 pixels topped by a navigation bar (illustrated in Figure 4.9). Participants evaluated the browser after the study and scored it 3.6±0.9 on average on a 5-point Likert scale. We report here a few of the comments given by the participants regarding the browser: “Difficulty in getting the keyboard out. Not very sensitive to touch” (participant 5), “Generally worked well but it sometimes dropped the keyboard when I was typing to select places to type. The search bar was also difficult to use - small to touch easily and it was hard to highlight specific text.” (participant 6), “I couldn’t tell the difference to other browser.” (participant 10).

Figure 4.9: Browser’s navigation bar (truncated).
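The fixation detection threshold mentioned earlier in this section is expressed both in pixels and in degrees of visual angle. The sketch below illustrates the standard geometric relation between the two. The pixel density and the viewing distance are not stated in this section: the values used in the example (about 144 pixels per inch for a 12-inch screen rendered at 1440×960, and a 400 mm viewing distance) are assumptions chosen for illustration, under which 22 pixels comes out close to the reported angle.

```python
import math


def pixels_to_visual_angle(size_px: float, pixels_per_mm: float,
                           distance_mm: float) -> float:
    """Visual angle (in degrees) subtended by size_px at the given distance."""
    size_mm = size_px / pixels_per_mm
    return math.degrees(2 * math.atan(size_mm / (2 * distance_mm)))


def visual_angle_to_pixels(angle_deg: float, pixels_per_mm: float,
                           distance_mm: float) -> float:
    """Inverse relation: on-screen size in pixels for a given visual angle."""
    size_mm = 2 * distance_mm * math.tan(math.radians(angle_deg) / 2)
    return size_mm * pixels_per_mm


# Illustrative values only (assumptions, not taken from the study).
ASSUMED_PIXELS_PER_MM = 144 / 25.4
ASSUMED_DISTANCE_MM = 400.0

angle = pixels_to_visual_angle(22, ASSUMED_PIXELS_PER_MM, ASSUMED_DISTANCE_MM)
```

Because the angle depends on the viewing distance, a threshold fixed in pixels corresponds to a slightly different angle whenever a participant leans towards or away from the screen.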

Participants evaluated the overall setup after performing the tasks and gave it an average score of 3.6±0.8 on a 5-point Likert scale. Here are some of the participants’ comments regarding the overall setup: “Position of hands not very comfortable. When I use tablets I keep them more vertical than horizontal.” (participant 5), “Most things were intuitive but with the exceptions from the previous section [about the browser]”, “It’s probably not as natural as holding the device, but the rig did not substantially affect my experience.” (participant 10), “Simple, unobtrusive and straightforward to use.” (participant 16).

We categorised the tapped objects into three distinct types: HTML, browser or keyboard⁴ elements. HTML and browser targets were tracked via the browser, using JavaScript code injected when the OnWebBrowserDoClick callback method of the WebBrowser object was called (see the code preview in Appendix A, Figure A.2). The relevant elements were written into a log file containing:

• timestamp,

• type of event:

  AddrTextBoxTouched when the browser’s address bar was tapped on,

  BackButton when the browser’s back button was tapped on,

  click when a webpage HTML element was tapped on,

  GoButton when the browser’s go button was tapped on,

• target’s size,

• target’s relative position in the viewport,

• for HTML targets: the tag name (for instance A, INPUT, etc.).

³ This is the dimension of the full screen when the tablet is not in high DPI mode.

⁴ In the thesis, keyboard should be understood as the virtual keyboard displayed on the tablet’s

Figure 4.10: System architecture.

We developed a dedicated application (in C#) to track the keyboard targets, with the help of the native Windows library Microsoft Keyboard Input API. The resulting log files contained:

• timestamp,

• key code.

In a post-hoc step, we aggregated these different log files with the tap log files into a single file, based on the timestamps. This final file contained the following information:

• timestamp,

• participant ID,

• task,

• tapped target type (html, KB or browser),

• tapped target name: for type html, the HTML element tag name (e.g. A or INPUT); for type KB, the key code (e.g. Return, Back or OemPeriod); for type browser, the keyword indicating which browser object was touched (e.g. AddrTextBoxTouched),

• for type html or browser: the tapped target abscissa; for type KB: the literal value of the key (e.g. “Return”, “Back” or “.”),

• for type html or browser: the tapped target ordinate,

• for type html or browser: the tapped target width,

• for type html or browser: the tapped target height,

• the tap abscissa,

• the tap ordinate.
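The timestamp-based aggregation can be pictured with the following sketch, which attaches to each tap the target event closest in time. It is only one plausible way of performing such a merge, not a description of the actual script. The dictionary keys, the function names nearest_index and merge_taps_with_targets, and the tolerance of 50 time units are all hypothetical.

```python
from bisect import bisect_left
from typing import Dict, List


def nearest_index(sorted_times: List[float], t: float) -> int:
    """Index of the value in a non-empty sorted list that is closest to t."""
    i = bisect_left(sorted_times, t)
    if i == 0:
        return 0
    if i == len(sorted_times):
        return len(sorted_times) - 1
    before, after = sorted_times[i - 1], sorted_times[i]
    return i if (after - t) < (t - before) else i - 1


def merge_taps_with_targets(taps: List[Dict], targets: List[Dict],
                            tolerance: float = 50) -> List[Dict]:
    """Attach the closest target event to each tap.

    Both lists hold dicts with a 'timestamp' key, and targets must be
    sorted by timestamp. A tap with no target event within the tolerance
    is kept without target fields.
    """
    target_times = [t["timestamp"] for t in targets]
    merged = []
    for tap in taps:
        row = dict(tap)
        if target_times:
            j = nearest_index(target_times, tap["timestamp"])
            if abs(target_times[j] - tap["timestamp"]) <= tolerance:
                # Copy the target fields, keeping the tap's own timestamp.
                row.update({k: v for k, v in targets[j].items() if k != "timestamp"})
        merged.append(row)
    return merged
```

Using a tolerance avoids pairing a tap with an unrelated event when one of the loggers missed it. Choosing its value would require knowing the clock offsets between the logging applications, which this section does not report.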

The flow chart in Figure 4.10 summarises the steps described above.
