7 Conclusions
7.3 How to go on with Eye-Tracking Research for Interaction
As already mentioned in the introduction, the knowledge needed for building gaze-aware interfaces is spread over many disciplines. A researcher or research team from a single discipline is in a difficult position because knowledge from the other disciplines is missing. Time is wasted by repeating experiments which were already done or by first becoming an expert in another discipline. Consequently, an interdisciplinary team can boost eye-tracking research for interaction. It is clear that interaction research needs a specialist from the field of HCI. Besides the HCI researcher, a psychologist would help a lot. Psychology looks back on several decades of eye-tracking research and there are many results which could contribute to the goal. Psychology knows about attention and recognition. It has studied the reading process in detail and could answer questions on motor learning or the memorising of gaze gestures. Medicine, too, has used eye tracking for many years, mostly for diagnosis and physical rehabilitation. Medicine knows about fatigue of the eye muscles and how high the intensity of the infrared illumination used for the eye tracker may be. The development of an eye tracker with properties a commercial eye tracker does not have requires a computer scientist with skills in image processing. Finally, it helps to have experts in the team for special applications. In the case of gaze-aware advertisement this would be a communication scientist and for gaze-assisted e-learning an expert in the field of pedagogy.
Research on eye tracking for interaction has increased steadily over the last years and has created ideas which were tested in the laboratory. For the goal of a product which could sell in the mass market, these ideas should be tested outside the laboratory in a real environment. For example, the gaze positioning with the touch-sensitive mouse presented in chapter 4.3 has the potential to be made into a product. Building a prototype for a user study in a real working environment needs an eye tracker which allows free head movement (and not an eye tracker with a chin rest). For such a user study the participants should work with the system for at least a week and not only for half an hour, as is typical for a study in a laboratory environment. As a user study requires a certain number of participants to produce statistically significant results, the availability of more than one prototype will bring quicker results by conducting the study in parallel instead of sequentially. If the gaze-positioning system is to be sold as a medical product to treat repetitive strain injury with a subsidy from health insurance, it would be an advantage to have the medical department involved. Despite falling prices, eye trackers allowing free head movement are still costly. The patents on eye-tracking methods allowing free head movement may also concern the people or institutions that finance such research.
For the gaze gestures presented in chapter 5 it is not clear whether they have the potential for a mass market product, for example as a remote control for a media centre, or whether they are only applicable for a niche product. Nevertheless, the situation is similar to gaze-assisted pointing and it needs a prototype for tests outside the laboratory.
The usage of eye gaze as context information is still an open field and definitely good for further scientific publications. There are many open questions to answer. It starts with practical questions: what do we see on a web page and what do we not see? What are the implications for a good design of web pages? Further questions are to which extent and how reliably a computer can detect the user’s activity or intention from the gaze. How should a system react when it knows the user’s activity or intention? Is it useful or even necessary to combine the information from gaze with other context information? Is there a general concept for utilizing eye gaze to build smart devices or does it depend on specific applications?
Besides the technological questions there are also social or human-related questions. Does eye-tracking technology help us to keep our privacy, as is the intention of the research presented in section 5.4.5, or does it threaten our privacy? An eye tracker can reveal that an employee did not concentrate on his or her work and looked out of the window most of the time. The other way round, an eye tracker can detect that we have worked too hard for too long and encourage us to take a break, which protects our health and ensures the quality of our work. An eye tracker can also find out whether a person’s degree of literacy is low, because of low reading speed and because of reading long sentences and difficult words twice. Again, this raises the question of how we want to use the capabilities of eye-tracking technology.
It is also possible to ask application-specific questions. Does a gaze-aware e-learning application increase the success or motivation of the learner and does it matter whether the learner is an adult or a child? Does a gaze-assisted video conference system, which presents not only the document to talk about but also where on the document the gaze of the other participant is, support the communication? Can gaze-aware advertisement increase sales? Finally, eye-tracking technology without the aspect of interaction needs further research, especially on eye trackers for outdoor use.
The list of interesting research questions given above is long and far from being complete. It is not clear which question is most urgent to answer or most promising for results. Therefore, it is up to the researcher which question to pick.
7.4 The Future of Gaze-Aware Systems
The integration of eye tracking with the computer is technically possible. The question remains if and when gaze-aware systems will become standard. Whether this will happen depends mainly on the availability of useful concepts for user interaction as they are discussed in this thesis. Several factors influence the future of gaze-aware systems: the availability, the demand, and the costs. These factors influence each other; if the demand grows, it will cause more availability at lower costs. In addition, sinking costs can increase the demand. At the moment eye-tracker systems are available, but they are not prepared for working as an input device, except for accessibility systems which do not bring a benefit to regular users. Most eye trackers existing today serve the purpose of recording and analyzing gaze data, and the demand in this market is low compared to the demand for input devices like mouse devices or webcams. Consequently, the prices for eye trackers are still high, again compared to mouse devices or webcams.
The future of gaze-aware systems will depend on an application. The graphical user interface was the application that pushed the mouse device from a special input device for CAD (computer aided design) engineers to an input device for the masses. Although it is possible to direct a graphical user interface solely with the keyboard and without a mouse, in many cases even more efficiently, most people are not able to operate such a system if the mouse is missing or not working. This is the reason for the big success of the mouse device. There is no comparable application for gaze-aware systems on the horizon that everybody wants to have and which does not work well without an eye tracker, as was the case for the graphical user interface and the mouse.
An application that could have the potential to create a mass market for eye-tracking technologies is a computer game. Computer games are a growing market and special input devices for game stations fill the shops. An eye tracker for computer games could come as a head-mounted device: a headset with earphones and microphone, as is commonly used for gaming already, with two extra cameras. One camera is mounted near the microphone and focuses on one eye; the other camera is mounted next to one earphone and has the same view as the eye. The camera at the microphone tracks the eye and, because it has a rigid connection with the head, head movements do not influence it. The camera at the earphone sees the display, and the head position can be calculated by detecting the corners of the display or by matching the known display content to the camera picture. The hardware for such an eye tracker consists of a headset, two webcams and perhaps an infrared LED, and altogether is available for less than 100 Euro or Dollars. The main costs are the software development and the effort to make it a product. These costs are small per piece if produced in high quantities. The reason why an eye tracker could be successful in the market as an input device for computer games lies in the speed. Eye-tracking interaction is fast if the targets are not too small and if an extra input modality is used. For a typical shooting game the targets are big enough in size and the extra input modality is the fire button. While a saving of 300 milliseconds for a pointing operation does not make much difference for a spreadsheet application or a word processor, it makes a big difference for an action game. The excitement and finally the level of adrenalin in the body are directly related to the speed of the game. The experiences with a gaze-aware game “Moorfisch” (inspired by the game “Moorhuhn”), written by students as an exercise, support this assessment.
The task of the game is to shoot different kinds of fish with the gaze, and every fish shot adds points to the score. To make the game more interesting, a special type of fish (it could be a diver too) results in minus points. Picking up a pearl from a seashell that only opens occasionally gives a bonus. Although the game and the graphics are quite simple compared to commercially available games, there was big demand for it when it was presented at the open day of the university.
Figure 81: Screenshots of the Moorfisch game written by students as an exercise based on an interface developed in the course of this thesis.
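The head-mounted tracker outlined above relies on the scene camera detecting the four corners of the display. One common way to use such corners, given here only as a minimal sketch and not as the method of this thesis, is to estimate a planar homography from the four corner positions in the camera picture to the display coordinates and to map the gaze point through it. The function names and the corner coordinates in the usage note are hypothetical, and the sketch assumes that the gaze point is already known in scene-camera coordinates.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography(src, dst):
    """3x3 homography (last entry fixed to 1) mapping four src points to dst."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h1 x + h2 y + h3) / (h7 x + h8 y + 1), same form for v
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def camera_to_display(h, point):
    """Map a point from scene-camera pixels to display pixels."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

With hypothetical detected corners such as `[(100, 80), (540, 100), (520, 400), (120, 380)]` and a 1920 x 1080 display, `homography(corners, [(0, 0), (1920, 0), (1920, 1080), (0, 1080)])` yields a matrix that `camera_to_display` can apply to every gaze sample. The corners have to be detected again in each frame, because the mapping changes whenever the head moves.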
Research on eye tracker input in first person shooter games [Isokoski, Martin 2006] could not (yet) show an increase in performance compared to classical mouse input. This result is in contradiction to the findings for the “hardware button” of Ware and Mikaelian [Ware, Mikaelian 1987] and the user study presented in section 4.4.2, where gaze positioning together with key input was significantly faster than a classical mouse. The master’s thesis of Jönsson [Jönsson 2005] reports high acceptance for eye trackers as input for computer games and this is in accord with the observations made with the “Moorfisch” game. As the computer game industry always searches for new ideas, it is only a question of time until cheap eye trackers for gaming will be in the shops. The availability of cheap eye trackers will lead to the development of further applications for such an eye tracker.
Attention sensors for mobile devices are a further possibility to introduce eye tracking to the mass market. The costs for an attention sensor are negligible and the manufacturers of mobile devices always look for new features to have an advantage in the highly competitive market. Such an attention sensor in a mobile video viewer can provide the functionality of pausing the video when nobody is looking at it. It can also provide a power-saving function by switching off the display when nobody is looking at it. The careful use of energy is very important for mobile devices as the capacity of the batteries is limited. A laptop typically switches off the display after a certain time without key or mouse input. This concept does not work when watching a video as there is no input from the mouse or keyboard. The problem is also well known from mobile MP3 players, which try to solve it with a HOLD switch. While it seems to be nearly impossible to detect whether somebody listens to audio content, an eye tracker can detect whether somebody is watching video content.
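The behaviour of such an attention sensor can be sketched as a small state machine. The following is only an illustration under stated assumptions: the class name, the two timeout values and the boolean `eyes_detected` signal are hypothetical, and a real sensor would deliver noisier data that needs filtering.

```python
class AttentionController:
    """Pauses playback and switches off the display when nobody looks at it."""

    def __init__(self, pause_after=1.0, display_off_after=10.0):
        # Both timeouts (in seconds) are hypothetical values for illustration.
        self.pause_after = pause_after
        self.display_off_after = display_off_after
        self.last_seen = 0.0
        self.playing = True
        self.display_on = True

    def update(self, now, eyes_detected):
        """Feed one sensor reading; returns (playing, display_on)."""
        if eyes_detected:
            # Somebody is looking: resume playback and wake the display.
            self.last_seen = now
            self.playing = True
            self.display_on = True
        else:
            away = now - self.last_seen
            if away >= self.pause_after:
                self.playing = False
            if away >= self.display_off_after:
                self.display_on = False
        return self.playing, self.display_on
```

The short pause timeout tolerates blinks and brief glances away, while the longer timeout for the display avoids switching it off during a short interruption. Unlike the inactivity timer of a laptop, neither timeout depends on key or mouse input.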
A good chance for eye tracking in the smaller high-end market is the trend to large displays. Interaction with large displays or multiple monitor setups by a mouse has problems. One problem is that people cannot find the mouse pointer on the large display area; another problem is how to adjust the control-gain ratio for the mouse. A high gain causes problems for the precision of the mouse movement, as explained by Fitts’ law, while a low gain leads to mouse movements which exceed the range of the hand. Concepts like focus activation by gaze (as suggested by Vertegaal) or MAGIC pointing (as suggested by Zhai) and preferably MAGIC touch (as suggested here) can help. As large displays are still expensive, the eye-tracking device adds comparatively little to the total costs. The MAGIC touch principle also saves many hand movements and for this reason helps people who suffer from RSI (repetitive strain injuries). As medical treatment is expensive, an eye tracker and a touch-sensitive mouse can be the cheaper alternative.
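Why placing the pointer at the gaze position helps on a large display can be illustrated with the index of difficulty from Fitts’ law. The sketch below assumes the Shannon formulation, ID = log2(D/W + 1), and a strongly simplified warp rule; the function names and the threshold of 120 pixels are hypothetical and not taken from the MAGIC pointing or MAGIC touch implementations.

```python
import math


def index_of_difficulty(distance, width):
    """Fitts' index of difficulty in bits (Shannon formulation, assumed)."""
    return math.log2(distance / width + 1.0)


def warp_pointer(pointer, gaze, warp_threshold=120.0):
    """Jump to the gaze position if the pointer is far away from it.

    The threshold (in pixels) is a hypothetical value; below it the pointer
    stays where it is so that manual fine positioning is not disturbed.
    """
    if math.dist(pointer, gaze) > warp_threshold:
        return gaze
    return pointer
```

For a target of 100 pixels width at a distance of 3100 pixels the index of difficulty is 5 bits. If the pointer is first placed by gaze within 300 pixels of the target, the remaining manual movement has an index of difficulty of only 2 bits.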
All these visions of gaze-aware systems could become reality within the next years and some probably will. Prophecies for longer periods are speculations. Nevertheless, it is clear that the evolution of human-computer interfaces will lead to systems that are more ‘human’ and not to systems where the humans have to act like computers. As the eye gaze is very important for the human-human interaction, it will definitely be very important for the development of future human-computer interfaces.