
4.2 Technical development

4.2.2 Technical setup

We implemented and tested four setups of increasing complexity. The first two use two computers and rely on the manufacturer’s software, ClearView, to operate the eye-trackers and on a third-party shared application to support the collaboration. While they are simpler to implement at first, the synchronization of the data can be hard to accomplish and is generally not very accurate. The other two setups use a third computer which runs a custom shared application and is in charge of logging all data. While this is more complex to implement, it offers much greater flexibility in the use and logging of the gaze data and can reach a high level of synchronization accuracy.

Setup 1: Two computers and manual eye-tracker operation

This solution, shown schematically in figure 4.1, is certainly the most straightforward way to use two eye-trackers in parallel. It simply consists in manually running each eye-tracker with the ClearView software on its own computer, using a custom stimulus setting. The third-party shared application, which constitutes the stimulus, is then run on each computer. Such a setup produces three unsynchronized sets of data which have to be synchronized afterwards. First, we have the two sets of data (gaze data, input events, and screen recording) recorded by the two eye-trackers through the ClearView software. Second, we have the logs of actions and stimulus changes produced by the shared application. We assume that the shared application has some internal synchronization mechanism which lets it produce a single log, with a single time base, that contains the events happening on both computers. This log provides the only way to relate the times of our different datasets.

Figure 4.1: Dual eye-tracking technical setup 1. Differences in gray levels indicate that logs do not have the same time base

The general idea for synchronizing these streams of data is to find correspondences between events logged by the shared application and input events logged by the eye-tracker on each computer. In this way, it is possible to establish a correspondence between the timestamps of each eye-tracker’s logs and the timestamps of the shared application logs, so that all events are brought back to the time base of the shared application. The crucial point is that the shared application logs some events which correspond directly to input events of one of the two subjects, so that we can find the matching input events in the log of that subject’s eye-tracker.

For example, in one experiment we conducted (see section 5.2.2), the shared application was a tool to build diagrams consisting of labeled boxes and links. The application logged the creation of each new box together with the identifier of the subject who created it. Boxes were generally created with a double-click, which allowed us to match the double-click events appearing in the subject’s log with the box creation events of the application. However, a single match was not sufficient because there were other double-click events that did not correspond to box creations. The solution thus consisted in collecting several such events (we also matched other types of events) and in finding the time offset for which they best match their corresponding input events in the subject’s log. This step was carried out for each subject separately, and we ended up with two time offsets: one between the first subject and the shared application, and one between the second subject and the shared application. With these in hand, the times of all data streams could be brought back to a single time base.
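The offset search described above can be sketched as follows. This is a minimal illustration in Python, not the code used in the study: the function names, the matching tolerance, and the sample timestamps are all assumptions. Every pairwise difference between an application event and an input event is tried as a candidate offset, and the candidate under which the most application events find a nearby input event is kept.

```python
from bisect import bisect_left


def count_matches(app_times, input_times, offset, tol):
    """Count application events that have an input event within `tol`
    seconds once `offset` is added to the eye-tracker timestamps."""
    shifted = sorted(t + offset for t in input_times)
    matched = 0
    for a in app_times:
        i = bisect_left(shifted, a)
        # Only the neighbours around the insertion point can be the closest.
        if any(abs(a - s) <= tol for s in shifted[max(i - 1, 0):i + 1]):
            matched += 1
    return matched


def best_offset(app_times, input_times, tol=0.05):
    """Return (offset, matches) such that app_time ~ input_time + offset.

    Every pairwise difference is a candidate, so the right offset is among
    the candidates as soon as one logged pair truly corresponds.
    """
    candidates = sorted({a - t for a in app_times for t in input_times})
    scored = [(count_matches(app_times, input_times, c, tol), c)
              for c in candidates]
    matches, offset = max(scored, key=lambda pair: pair[0])
    return offset, matches


# Hypothetical data: box creations in the application log (seconds) ...
box_creations = [100.0, 130.0, 175.0, 220.0]
# ... and double-clicks in one subject's eye-tracker log, two of them spurious.
double_clicks = [12.5, 30.0, 42.5, 60.25, 87.5, 132.5]

offset, matches = best_offset(box_creations, double_clicks)
print(offset, matches)
```

Spurious double-clicks only produce candidates supported by a few matches, which is why several corresponding events are needed before the best-supported offset can be trusted.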

There is another way to synchronize, which is more difficult to perform but may save the situation when the shared application does not offer the required action log. The approach consists in detecting, in the screen recordings of both subjects, shared events that happen on both computers at the same time. In the example described above, when a box is created, it appears on both screens at the same time. It would thus have been possible to detect the times at which boxes appear in the screen recording of each subject and, from them, to find the time offset between the eye-tracker logs of the two subjects. Note that while this may seem tricky because it requires computer vision, it is often relatively easy because we are dealing with very regular images, and the comparison can be accomplished with basic image matching. We actually used such a technique to perform a spatial synchronization of the gaze data coordinates with the shared application coordinates. In this case, the workspace in which subjects could create boxes and links was bigger than a single screen, and subjects could scroll in this space. However, the application did not log scroll events, so we had no information about which part of the workspace a subject was viewing at a given time. We therefore used the video to detect changes of viewport for each subject.
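As an illustration of what basic image matching can look like here, the sketch below finds the first frame of each recording in which a known pattern (for instance a newly created box) appears, and derives the time offset between the two recordings from the frame indices. It is only a sketch under stated assumptions: frames are taken to be small grayscale images stored as lists of rows, both recordings are assumed to share one frame rate, and all function names are hypothetical.

```python
def patch_difference(frame, template, top, left):
    """Sum of absolute pixel differences between the template and the
    frame patch whose top-left corner is at (top, left)."""
    return sum(
        abs(frame[top + r][left + c] - template[r][c])
        for r in range(len(template))
        for c in range(len(template[0]))
    )


def find_template(frame, template, max_difference=0):
    """Return (top, left) of the first patch that matches the template
    within `max_difference`, or None if there is no such patch."""
    rows, cols = len(template), len(template[0])
    for top in range(len(frame) - rows + 1):
        for left in range(len(frame[0]) - cols + 1):
            if patch_difference(frame, template, top, left) <= max_difference:
                return (top, left)
    return None


def first_appearance(frames, template, max_difference=0):
    """Index of the first frame that contains the template, or None."""
    for index, frame in enumerate(frames):
        if find_template(frame, template, max_difference) is not None:
            return index
    return None


def recording_offset(frames_a, frames_b, template, fps):
    """Seconds to add to recording B's time to obtain recording A's time,
    assuming the template appears on both screens at the same instant."""
    a = first_appearance(frames_a, template)
    b = first_appearance(frames_b, template)
    if a is None or b is None:
        raise ValueError("template not found in both recordings")
    return (a - b) / fps
```

The position returned by `find_template` is also the kind of information needed for the spatial case: locating a fixed element of the workspace in each frame indicates how far the view has been scrolled.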

Chapter 4. Dual eye-tracking methodology

A third possibility is to use the audio data, which can be used even if we do not have simultaneous events. This technique requires, however, that the subjects are able to speak to each other and that either they are in the same room, so that both eye-trackers record both voices, or a third application records both voices. In such a situation, we can use the audio tracks included in the screen recordings of the eye-trackers and find the time offset that best aligns them. Indeed, if both voices are present on both recordings, even if each recording has a different relative amplitude between the two voices, it is possible to align the audio streams so that they match as closely as possible. Similarly, if we have a third recording with both voices in it, it can be matched separately with the audio stream of each eye-tracker because they have portions of sound in common. In the example described above, we had such a third audio stream, which came from the audio conferencing software used for communication between the two collaborators. We used this technique to synchronize this third audio stream with the eye-trackers’ time, so that our external audio stream ended up synchronized with all other data.
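One common way to find such an alignment is cross-correlation: slide one signal over the other and keep the lag at which their product sums to the largest value. The sketch below is an assumption about how this could be done, not a description of the procedure actually used. The signals are synthetic, the sample rate is hypothetical, and real recordings would typically be downsampled and correlated with an FFT-based routine for speed.

```python
import random


def best_lag(reference, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `reference`.

    A positive lag means that other[i] corresponds to reference[i + lag],
    i.e. `other` started recording `lag` samples after `reference`.
    """
    def score(lag):
        total = 0.0
        for i, value in enumerate(other):
            j = i + lag
            if 0 <= j < len(reference):
                total += value * reference[j]
        return total

    return max(range(-max_lag, max_lag + 1), key=score)


# Synthetic stand-in for a sound heard by both recorders.
rng = random.Random(0)
sound = [rng.gauss(0.0, 1.0) for _ in range(400)]

recording_a = sound
# Recorder B starts 7 samples later and picks the sound up at half amplitude.
recording_b = [0.5 * s for s in sound[7:]]

sample_rate = 100  # Hz, hypothetical
lag = best_lag(recording_a, recording_b, max_lag=20)
print(lag, lag / sample_rate)  # lag in samples, then in seconds
```

The amplitude difference between the two recordings scales every score by the same factor, so it does not move the position of the maximum.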

The quality of synchronization of this method depends greatly on the logging accuracy of the third-party application, since it requires that shared actions are logged accurately. Qualitative analyses also revealed that a time compression or expansion may exist between the two computers, which means that time passes faster on one computer than on the other. This calls for a more subtle correction than a simple offset, namely computing both an offset and a compression factor.
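One way to write such a correction is given below. The notation is ours and is only a sketch: $x_i$ denotes the eye-tracker timestamp and $y_i$ the shared application timestamp of the $i$-th matched event, $a$ is the compression factor and $b$ the offset, and both are estimated by ordinary least squares over the matched events.

```latex
y_i \approx a\,x_i + b,
\qquad
a = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2},
\qquad
b = \bar{y} - a\,\bar{x}
```

A pure offset corresponds to the special case $a = 1$, so a fitted value of $a$ that differs noticeably from 1 indicates that the two clocks drift relative to each other.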

Setup 2: Two computers with automatic eye-tracker operation

To avoid the need for these post-synchronization steps, we implemented a slightly different solution which relies on the high-level eye-tracker programming API. The general setup is similar to the previous one, with two separate computers each running the eye-tracker software. The only difference is that we use the programming API to launch the recording process in ClearView on both computers at the same time. This requires the development of a small application that can run on either of the two computers and that is responsible for sending a triggering message to both computers at the same time. The rationale is that if both instances of the eye-tracker software are started at the same time, the beginnings of the two recordings can be considered as representing the same instant, and the offset between the two eye-tracker times can be computed directly from these values. This setup is shown in figure 4.2.

It is nevertheless still necessary to synchronize the shared application logs with the two eye-tracker logs. Two cases are possible. First, the shared application can be closely linked with the application responsible for triggering the eye-tracker recording process. This is in principle reserved for situations in which we use a custom or open-source shared application, so that it can easily integrate the module responsible for starting the eye-trackers and can record the time at which the eye-tracking process is started. Second, if we use a third-party closed-source application, we have to find a way to inform the application of the starting time of the eye-tracking process.

Figure 4.2: Dual eye-tracking technical setup 2. Differences in gray levels indicate that logs do not have the same time base

This setup is slightly better than the previous one because it does not require any post-processing to synchronize the various data streams. However, this technique has potential drawbacks. First, it requires programming the application that starts the eye-tracking process. While this is not a huge task, it still requires some competence and care to be done correctly. This is, however, not a real drawback compared to the previous setup, which also requires programming for the post-synchronization part. Second, this method does not work with every shared application because, as explained above, the application must be made aware of the precise starting time of the eye-tracking process. Third, and more importantly, this method relies on the assumption that the triggering of the recording process is fast and accurate and takes the same time on both computers. As the triggering goes through the ClearView software, there is a good chance that it is not very precise and that variable time lags occur between the triggering message and the actual start of recording. Finally, this solution does not take into account the potential time compression or expansion between the two computers.

Setup 3: Three computers with synchronous data logging

Considering the difficulty of the previous methods as well as the imprecision of the resulting synchronization, we developed another technique based on the low-level eye-tracker API. This solution consists in writing an application that is entirely responsible for the eye-tracking process on both computers. Of course, as in the previous solution, this also requires a custom, or at least open-source, shared application so that all data can be logged in the same way.

As shown in figure 4.3, this setup uses three computers: one for each eye-tracker and a third, server computer which runs the shared application and is in charge of all logging. The two client computers run only lightweight applications which send user actions to the server and receive from it the changes to apply. The server application is the core of the system. It must integrate calls to the eye-tracker library to start the eye-tracking process and to collect the gaze data. In short, the eye-tracker API lets us define a callback function which is called for each new gaze sample. The parameters of the callback function contain all the raw gaze information described in section 3.1. To be able to communicate with both eye-trackers from the same computer, we duplicated the DLL of the eye-tracker API and called the two copies separately within the server application. The data were synchronized by assigning new timestamps to the gaze data as they were collected by the application.
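The callback-based re-timestamping can be pictured with the following sketch. It does not use the eye-tracker API: `GazeLogger`, `callback_for`, and the dictionary standing in for a gaze sample are hypothetical names chosen for illustration, and the server clock is assumed to be Python's `time.perf_counter`. The point is only that one logger hands a separate callback to each tracker and stamps every incoming sample with the same server-side clock.

```python
import threading
import time
from dataclasses import dataclass


@dataclass
class GazeRecord:
    tracker_id: str     # which eye-tracker produced the sample
    server_time: float  # timestamp assigned by the server on arrival (s)
    sample: dict        # raw gaze fields, passed through untouched


class GazeLogger:
    """Collects samples from several trackers on a single server clock."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._lock = threading.Lock()  # callbacks may run on separate threads
        self.records = []

    def callback_for(self, tracker_id):
        """Build the function a tracker would call for each new sample."""
        def on_gaze(sample):
            with self._lock:
                stamp = self._clock()  # stamped on arrival, not on capture
                self.records.append(GazeRecord(tracker_id, stamp, sample))
        return on_gaze


# Usage with two hypothetical trackers.
logger = GazeLogger()
on_gaze_a = logger.callback_for("A")
on_gaze_b = logger.callback_for("B")
on_gaze_a({"x": 0.41, "y": 0.52})
on_gaze_b({"x": 0.40, "y": 0.55})
```

Because the timestamp is taken when the callback runs, it includes whatever delay separates the capture of the sample from its arrival at the server.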

The problem with this technique is that it relies on the server re-timestamping the data. This may cause problems for two reasons. First, the server does not necessarily have an accurate timer. Second, transmission introduces a latency between the moment the data are collected and the moment they are timestamped by the server.

Setup 4: Three computers with unique high-precision timer

We were finally able to correct the issues of the previous setup by using a special feature of the eye-tracker API, which provides a way to choose how the data are timestamped. More specifically, it allows the timer of another computer to be used. By default, the API uses a timer from the computer running the TET server (in our case, the two client computers). However, it is possible to tell the API to use the timer of the computer which makes the calls to start the eye-tracker (in our case, the server). Hence, it is possible to instruct both eye-trackers to use the timer of the server computer, so that they timestamp the data with the same time base. Note that such a use of this functionality, i.e. having two eye-trackers use the same timer, is not planned by the manufacturer, who explicitly told us that they could not guarantee that it would work (personal communication with Tobii). However, we ran several tests which show that it works as expected: even when the eye-trackers are started at different times, both gaze data streams have comparable timestamps. The general technical setup is similar to setup 3 (see figure 4.3).

Figure 4.3: Dual eye-tracking technical setups 3 and 4

This method also has the great advantage that the timer used by the eye-trackers to timestamp the gaze data can also be accessed through the eye-tracker API. Hence, it can be used to timestamp all other data produced by the shared application. The only data that required special handling were the audio data, which are not logged but simply recorded in an audio file and therefore cannot be timestamped as they are recorded. Our solution consisted in using functions which give the current frame number of the audio file being recorded. Thanks to these functions, we periodically logged correspondences between the current time provided by the eye-tracker API and the current recorded frame of the audio file. It was then very easy to compute the offset and scaling factor between these two values, and the result proved to be very precise, with a near-perfect linear relationship between the two streams.
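Computing the offset and scaling factor from such logged correspondences amounts to fitting a straight line. The sketch below shows one way to do it with ordinary least squares and to check how linear the relationship is. The function names and the sample values (an audio file at 44100 frames per second whose first frame falls at 5 s on the eye-tracker timer) are assumptions made for illustration, not values from the study.

```python
def fit_line(xs, ys):
    """Least-squares fit ys ~ scale * xs + offset; returns (scale, offset)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    scale = sxy / sxx
    return scale, mean_y - scale * mean_x


def max_residual(xs, ys, scale, offset):
    """Largest absolute deviation of the logged points from the fitted line."""
    return max(abs(y - (scale * x + offset)) for x, y in zip(xs, ys))


def frame_to_time(frame, scale, offset):
    """Convert an audio frame number to a time on the eye-tracker timer."""
    return scale * frame + offset


# Hypothetical correspondences logged once per second during a recording.
audio_frames = [0, 44100, 88200, 132300]
tracker_times = [5.0, 6.0, 7.0, 8.0]  # seconds on the shared timer

scale, offset = fit_line(audio_frames, tracker_times)
print(scale, offset, max_residual(audio_frames, tracker_times, scale, offset))
```

A small maximum residual is what a near-perfect linear relationship looks like in practice, and `frame_to_time` then places any point of the audio file on the common time base.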

This solution clearly appeared to be the best we could achieve with this hardware. It reaches perfect accuracy in the sense that the gaze data of both eye-trackers are timestamped with the same time base even if the eye-trackers are started at different moments. The only aspect which needs to be handled with care is the logging of the subjects’ action data. As the server computer is in charge of the logging, all actions done on a client computer must first be sent to the server before being logged. This can introduce a small lag in the action log because of the duration of message transmission. Finally, this method also relies on the mechanism by which the eye-tracker server uses the timer of a third computer, and we do not know how well this is accomplished. This last aspect is the only potential source of synchronization errors in the gaze data.

4.2.3 Discussion

Dual eye-tracking studies raise several technical issues, mainly related to the synchronization of the various data streams that are recorded. In particular, the main challenge is the accurate synchronization of the gaze data produced by two separate Tobii 1750 eye-trackers. We have presented several possible technical setups which achieve this goal with varying levels of accuracy. The final setup we proposed achieves it very effectively. It allows us to record all data, i.e. both subjects’ gaze, the subjects’ actions, stimulus changes and audio data, in a synchronous manner through the use of a single timer.

Of course, these considerations depend heavily on the type of eye-tracker and the mechanisms it offers for its operation. Hence, the solutions presented here mainly concern the Tobii 1750. However, they also give insight into how to solve these issues in more general situations.

In document Dual Eye-Tracking Methods for the Study of Remote Collaborative Problem Solving (Page 92-98)