This paper proposes an image descriptor, Gabor Directional Binary Pattern (GDBP), for robust gaze estimation. In GDBP, Gabor magnitude information is first extracted from a cropped subimage. Local directional derivatives are then used to encode binary patterns along the given orientations. As an image descriptor, GDBP suppresses noise and is robust to illumination variations. Meanwhile, the encoding pattern emphasizes boundaries. We use the GDBP features of the eye regions and adopt Support Vector Regression (SVR) to approximate the gaze mapping function, which is then used to predict the gaze direction with respect to the camera coordinate system. In the person-independent experiments, our dataset includes 4089 samples from 11 persons. Experimental results show that gaze estimation can achieve an error of less than 2° using the proposed GDBP and SVR.
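The abstract does not spell out GDBP's exact encoding, but the core idea, thresholding local directional derivatives of a Gabor magnitude map and packing the signs into a bit code per pixel, can be sketched as follows. This is a minimal hypothetical illustration: the function name, the neighbor-offset set, and the bit order are my own assumptions, not the paper's definition, and the Gabor magnitude map is assumed precomputed.

```python
import numpy as np

def directional_binary_pattern(mag, offsets=((0, 1), (1, 1), (1, 0), (1, -1))):
    """Encode each interior pixel of a Gabor-magnitude map `mag` by the
    signs of its local directional derivatives along the given pixel
    offsets. Bit k is set when the derivative along orientation k is
    positive (neighbor brighter than center)."""
    h, w = mag.shape
    code = np.zeros((h - 2, w - 2), dtype=int)
    center = mag[1:-1, 1:-1]
    for k, (dy, dx) in enumerate(offsets):
        # Shifted view of the map: the neighbor at offset (dy, dx).
        neighbor = mag[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        code |= (neighbor > center).astype(int) << k
    return code
```

A histogram of these codes over an eye region would then form the feature vector fed to the SVR.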
This paper tackles RGBD-based gaze estimation with Convolutional Neural Networks (CNNs). Specifically, we propose to decompose gaze point estimation into eyeball pose, head pose, and 3D eye position estimation. Compared with RGB image-based gaze tracking, having the depth modality helps to facilitate head pose estimation and 3D eye position estimation. The captured depth image, however, usually contains noise and black holes which noticeably hamper gaze tracking. Thus we propose a CNN-based multi-task learning framework to simultaneously refine depth images and predict gaze points. We utilize a generator network for depth image generation with a Generative Adversarial Network (GAN), where the generator network is partially shared by both the gaze tracking network and the GAN-based depth synthesizer. By optimizing the whole network jointly, depth image synthesis improves gaze point estimation and vice versa. Since the only existing RGBD dataset (EYEDIAP) is too small, we build a large-scale RGBD gaze tracking dataset for performance evaluation. As far as we know, it is the largest RGBD gaze dataset in terms of the number of participants. Comprehensive experiments demonstrate that our method outperforms existing methods by a large margin on both our dataset and the EYEDIAP dataset.
A system based on a dual-state model for eye tracking is proposed in . Both of these methods require manual initialization of eye location. Witzner et al. model the iris as an ellipse, but the ellipse is locally fitted to the image through an EM and RANSAC optimization process. A system based on feature detection using one camera is suggested by Smith et al. . Noureddin et al. suggest a two-camera solution where a fixed wide-angled camera with a rotating mirror directs the orientation of the narrow-angled camera. A practical gaze point detecting system with dual cameras is also proposed by Park et al. . They employ IR-LED illuminators for the wide and narrow view cameras to overcome the problem of specular reflection on glasses. They attain an RMS error of 2.89 cm between the computed and the real positions. A non-intrusive eye gaze estimation system is proposed by Yoo et al. . Their system allows ample head movement and yields quite accurate results using cross-ratios under large head motion. Artificial neural networks (ANNs) have also been applied to eye gaze
A driver assistance system (DAS) constantly monitors the behavior of the vehicle, the behavior of the driver, and road scene events, and provides relevant information to the driver in case of any threatening event. This paper presents the automated detection and recognition of road signs, combined with monitoring of the driver's response. The complete system reads traffic signs, compares them with the driver's gaze, and provides immediate feedback if it appears the sign has been missed by the driver. The captured driver video is processed to detect the eye pair and determine the eye gaze using DWT and neural networks. The captured road scene video is processed to detect the presence of a speed sign board and the speed limit on it. Based on these detections, the driver is alerted. The system is tested with different scenarios and the performance is estimated at 92.308%.
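The excerpt names DWT-based features but not the wavelet or decomposition level. As an illustration of the kind of feature extraction such a pipeline might use, here is a one-level 2D Haar DWT sketch; the choice of Haar, the averaging convention, and the function name are my assumptions, not the paper's specification.

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2D Haar DWT of an even-sized grayscale image.
    Returns the approximation (LL) and detail (LH, HL, HH) subbands,
    each half the size of the input in both dimensions."""
    # Row-wise averages and differences of adjacent column pairs.
    a = (img[:, 0::2] + img[:, 1::2]) / 2.0
    d = (img[:, 0::2] - img[:, 1::2]) / 2.0
    # Column-wise averages and differences of adjacent row pairs.
    ll = (a[0::2, :] + a[1::2, :]) / 2.0
    lh = (a[0::2, :] - a[1::2, :]) / 2.0
    hl = (d[0::2, :] + d[1::2, :]) / 2.0
    hh = (d[0::2, :] - d[1::2, :]) / 2.0
    return ll, lh, hl, hh
```

The LL subband (or a flattened concatenation of subbands) would then serve as a compact input vector for the neural network classifier.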
Human beings use their eyes to visualise things around them and to communicate with surrounding people. More than half of human sensory information comes through the eyes, so the eyes convey a great deal of information about the user. By estimating the eye gaze, a person's presence, attention, focus, drowsiness, consciousness, or other mental states can be identified [1-5]. Besides, eye gaze can also be used as a form of Human Computer Interaction (HCI) that helps a human, as a user, to communicate with machines or devices. HCI based on eye gaze enables the user to interact with devices in a more natural way compared to interaction through a mouse or a keyboard [6-7].
Eye tracking is compelling for interaction as it is natural for users to direct their gaze to point at targets of interest. However, gaze pointing has inherent limitations in accuracy and precision, as eye movement is jittery and gaze estimation is subject to noise and imperfect calibration [20, 36]. In 2D interfaces, this can be managed by spacing of objects and targeting assistance. In contrast, in 3D interfaces, objects can be placed at different depths and, depending on viewpoint, appear as overlapping in the field of view, resulting in target ambiguity. Figure 1.a&b illustrate the problem. While a user is looking at object B, they might inadvertently select object A as it is close to the line-of-sight and partially occludes B. Ambiguity as to which of the two objects the user is looking at can be resolved with estimation of gaze depth, for comparison against the depth at which the objects are rendered.
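The disambiguation step described above can be sketched as a small selection rule: among objects near the line of sight, pick the one whose render depth best matches the estimated gaze depth. This is a hypothetical minimal version; the function name, the candidate tuple layout, and the 2° angular threshold are illustrative assumptions, not values from the paper.

```python
def select_target(gaze_depth, candidates, angle_thresh_deg=2.0):
    """Resolve gaze-target ambiguity using gaze depth.
    candidates: list of (name, angular_offset_deg, render_depth_m),
    where angular_offset_deg is the object's angle from the gaze ray.
    Objects within the angular threshold are kept; among those, the
    one whose render depth is closest to the estimated gaze depth wins."""
    near_ray = [c for c in candidates if c[1] <= angle_thresh_deg]
    if not near_ray:
        return None
    return min(near_ray, key=lambda c: abs(c[2] - gaze_depth))[0]
```

In the Figure 1 scenario, if A sits at 1.0 m and B at 2.0 m and the estimated gaze depth is near 2 m, B is selected even though A is slightly closer to the line of sight.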
Single camera systems can capture a limited field of view, fixed at a point, in high resolution. A video-based eye tracker works by illuminating the eye with an infrared light source. This light produces a glint on the cornea of the eye, called a corneal reflection. The glint is used as the reference point for gaze estimation. The pupil-glint difference vector remains approximately constant as the head moves. The glint clearly changes location when the head moves, but it is less obvious that the glint shifts position when the gaze direction changes. Some systems use a single camera, a collection of mirrors, and a single illumination source to produce the desired effect. Several commercial systems base their technology on one camera and one infrared light source.
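A common way to turn the pupil-glint difference vector into a gaze point is a polynomial mapping fitted during calibration. The sketch below, a hedged illustration rather than any specific commercial system's method, fits a second-order polynomial from pupil-glint vectors to known screen calibration points by least squares.

```python
import numpy as np

def poly_features(v):
    """Second-order polynomial basis of pupil-glint vectors (N x 2)."""
    x, y = v[:, 0], v[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])

def fit_gaze_mapping(pg_vectors, screen_points):
    """Least-squares fit from pupil-glint vectors (N x 2) to known
    screen calibration points (N x 2). Returns a (6, 2) coefficient
    matrix, one column per screen coordinate."""
    A = poly_features(np.asarray(pg_vectors, dtype=float))
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(screen_points, dtype=float),
                                 rcond=None)
    return coeffs

def predict_gaze(coeffs, pg_vectors):
    """Map new pupil-glint vectors to screen coordinates."""
    return poly_features(np.asarray(pg_vectors, dtype=float)) @ coeffs
```

With at least six well-spread calibration targets the system is determined; real systems typically use nine or more points to average out measurement noise.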
picture of the driver's face. Facial Feature Recognition and Monitoring: Parameterized Appearance Models (PAMs), for example Active Appearance Models and Morphable Models, are popular approaches for face tracking. Head Pose Estimation: In real driving situations, drivers change their head pose and facial expression while driving. Precisely estimating the driver's head pose in complex situations is a challenging problem. In this section, a 3D head pose estimation system is proposed to decouple rigid and non-rigid head motion. The head model is represented using a shape vector; the driver's gaze direction provides crucial information about whether the driver is distracted or not. Gaze estimation is a long-standing problem in computer vision. The EOR estimation is based on a 3D ray-tracing method that uses the geometry of the scene. Our EOR estimation algorithm computes the point where the driver's 3D gaze line intersects the scene; the system works reliably with drivers of various ethnicities wearing various kinds of glasses. However, when the driver is wearing sunglasses, it is difficult to robustly detect the pupil.
direct or averted gaze shifts (both eyes end up pointing direct or averted) or Mismatching shifts (one eye shifts to a direct direction, the other to an averted direction). We introduced an asynchrony between the left and right eye shifts, and asked participants to indicate which eye shifted earlier (TOJ). We assessed how TOJs were modulated by directional cue information prior to the avatar’s gaze shifts (starting gaze direction), by the type of gaze shifts the avatars performed (direct or averted shifts) and by the face context within which these shifts were set (left and right eyes set within a single face or each eye set within one of two adjacent hemifaces). We manipulated starting gaze direction to study the effect of attentional constraints on gaze shift TOJs. Prior research has shown that starting gaze provides a strong form of attentional cueing (Driver et al., 1999; Schneider & Bavelier, 2003): attention is automatically drawn in the direction the eyes are pointing. Since spatial attention biases TOJs (i.e. the object of attention appears to occur first, i.e. ‘prior entry’), we expect on the basis of this that eye shift TOJs will be biased in favour of the eye lying in the hemispace cued by the avatar’s starting gaze direction. For example, participants might have a tendency to report the left eye as shifting earlier when the avatar is initially looking toward the left side of the screen (drawing attention toward the left hemispace). We also manipulated gaze shift direction to evaluate how TOJs are affected by the directional content of left and right eye shifts. Several studies have highlighted asymmetries in the processing of direct and averted stimuli. Direct gaze is known to enhance attention and cognition (Conty et al., 2007; Senju & Hasegawa, 2005). We evaluated four gaze shift combinations: two directionally Matching shifts, where both eyes shift to direct (DD) or both shift to averted (AA); and two directionally Mismatching shifts, where
Consequently, electrodes placed in the vicinity of the eyes can measure these potentials, and the magnitude of these potentials depends on the eye gaze angle. EOG has many practical applications including wheelchair control [7,8], activity recognition [9,10], retinal function testing [11,12], sleep stage classification , or as a general gaze control interface [14–16]. It is also known as an artifact of electroencephalography (EEG) . However, its full potential for hearing aids (or indeed any mobile application) has not been fully recognized, mainly because the EOG is typically measured using large obtrusive electrodes attached to the sides of the head. Due to the electrical properties of the body, however, EOG can be measured anywhere on the head, although its magnitude varies with the electrode placement. When measured in peri-orbital positions it usually has values of 8–33 μV per 1° of visual angle, and around 3 μV/1° can be measured inside the ear canals [11,18]. This finding suggests that eye movements can be analysed by hearing aids with nothing more than conductive ear-moulds.
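Since the EOG amplitude scales roughly linearly with gaze angle, angle estimation reduces to dividing the measured voltage by a per-user sensitivity constant, which itself can be estimated from fixations at known angles. The sketch below is a minimal illustration of that relationship; the function names and the origin-constrained least-squares calibration are my own assumptions, and real sensitivities fall in the 8–33 μV/° range cited above for peri-orbital electrodes.

```python
def calibrate_sensitivity(angles_deg, voltages_uv):
    """Estimate EOG sensitivity (microvolts per degree) by least squares
    through the origin, from fixations at known gaze angles."""
    num = sum(a * v for a, v in zip(angles_deg, voltages_uv))
    den = sum(a * a for a in angles_deg)
    return num / den

def eog_to_angle(voltage_uv, sensitivity_uv_per_deg):
    """Convert a measured EOG amplitude (microvolts) to a horizontal
    gaze angle (degrees) using the calibrated sensitivity."""
    return voltage_uv / sensitivity_uv_per_deg
```

For in-ear electrodes the same procedure applies, only with a sensitivity near 3 μV/° and correspondingly lower signal-to-noise ratio.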
In addition, the visual representation of a mouse cursor was most similar to the dynamic feedback from participants’ own location estimations in the pre-test block; therefore the saliency with which this cue may transmit meaningful number-location information may be greater than that of other dynamic representations. Therefore, if the goal is to explicitly direct an observer to a precise area in the environment or provide a worked example (Atkinson, Derry, Renkl, & Wortham, 2000; Skuballa, Fortunski, & Renkl, 2015), then it is understandable that mouse movement may be just as appropriate in this regard. Establishing that the Mouse Cursor and Gaze Cursor models had a similar impact on performance emphasises that both can serve as a deictic pointing device that deliberately guides attention (Neider et al., 2010). However, gaze cursors also reflect underlying thought processes that may be implicit to the model and which they are not directly trying to communicate to the observer. That is, even when the model providing the eye movements is not trying to actively and didactically convey the ideal problem solution to subsequent observers in a bid to maximize learning, viewing their attentional processes is nonetheless informative (Litchfield et al., 2010; Litchfield & Ball, 2011). The strengths and weaknesses of these Mouse and Gaze Cursor models may explain the similar error rates observed in these conditions at Time 2.
The most common neural representations for spatial attention encode locations retinotopically, relative to the center of gaze. To keep track of visual objects across saccades or to orient toward sounds, retinotopic representations must be combined with information about the rotation of one’s own eyes in the orbits. Although gaze input is critical for a correct allocation of attention, the source of this input has so far remained unidentified. Two main signals are available: corollary discharge (copy of the oculomotor command) and oculoproprioception (feedback from extraocular muscles). Here we asked whether the oculoproprioceptive signal relayed from the somatosensory cortex contributes to coding the locus of attention. We used continuous theta burst stimulation (cTBS) over a human oculoproprioceptive area in the postcentral
5.3 Puzzle Domain: Speech, Gaze and Deixis
Data and Task
Our final experiment uses newly collected data (Kousidis et al., 2013), again from the Pentomino domain. In this Wizard-of-Oz study, the participant was confronted with a Pento game board containing 15 pieces in random colors, shapes, and positions, where the pieces were grouped in the four corners of the screen (example in Figure 6). The users were seated at a table in front of the screen. Their gaze was then calibrated with an eye tracker (Seeingmachines FaceLab) placed above the screen, and their arm movements (captured by a Microsoft Kinect, also above the screen) were calibrated by pointing to each corner of the screen, then the middle of the screen. They were then given task instructions: (silently) choose a Pento tile on the screen and then instruct the computer game system to select this piece by describing and pointing to it. When a piece was selected (by the wizard), the participant had to utter a confirmation (or give negative feedback), and a new board was generated and the process repeated (each instance is denoted as an episode). The utterances, board states, arm movements, and gaze information were recorded, as in (Kousidis et al., 2012). The wizard was instructed to elicit pointing gestures by waiting several seconds before selecting the participant-referred piece, unless a pointing action by the participant had already occurred. When the wizard misunderstood, or a technical problem arose, the wizard had the option to flag the episode. In total, 1214 episodes were recorded from 8 participants (all university students). All but one were native speakers; the non-native speaker spoke proficient German (see Appendix for a set of random example utterances).
We used the Twileyed games as data, and the device to reflect on the gaze design space, challenges, questions, and dimensions of gaze gameplay during a user study with twelve participants. We asked them to play the games and observed their experience and strategies. Based on the observations, we discussed, through 5 themes, the questions and opportunities for creating engaging and playful gaze-enabled games. The resulting discussion raised topics on the challenge of using gaze for both sensing and action, with a focus on attention, understanding the ambiguity of the interaction, and visual dilemmas. Moreover, we discussed the use of ambiguity and metaphors to create engaging and playful experiences. Finally, we identified how playing with gaze interaction opens up the space to design with gaze taking different identities. We use "identity" to refer to the game element that the user's gaze controls. With Twileyed we aimed to engage the gaze-enabled games research community to think outside the box.
While such findings lend support to the validity of Equilibrium Theory, they did not contribute much to further disentangling the single or joint effects of the examined behaviours. In this work, we will create a simulation of a meaningful social encounter in an immersive virtual environment where virtual agents interact with participants in a dynamic fashion. In this simulation, we will be able to let agents change their behaviours dynamically, while participants' responses are measured on-line, therefore not sacrificing experimental control. What is more, not only will we manipulate a combination of both gaze and proxemic behaviour during the social interaction, we will also use the technology to put behavioural measures in place that record user responses in these same two dimensions. This, to our knowledge, has not been part of an experimental design in the area so far.
For efficient collaboration between participants, eye gaze is seen as critical for interaction. Teleconferencing systems such as the AccessGrid allow users to meet across geographically disparate rooms, but as of now there seems to be no substitute for face-to-face meetings. This paper gives an overview of some preliminary work that looks toward integrating eye gaze into an immersive Collaborative Virtual Environment and assessing the impact this would have on interaction between the users of such a system. An experiment was conducted to assess the difference between users' abilities to judge what objects an avatar is looking at with only head gaze being viewed, and with both eye and head gaze data being displayed. The results from the experiment show that eye gaze is of vital importance to subjects correctly identifying what a person is looking at in an immersive virtual environment. This is followed by a description of how the eye tracking system has been integrated into an immersive collaborative virtual environment and some preliminary results from the use of such a system.
The performance of toddlers during the novel Test trials was analysed in a similar way. A main effect of Group was evident in toddlers’ proportionate gaze toward the Referred object, F(2,49) = 6.92, p < .01, partial η² = .22, and this remained highly significant once group differences in age or general developmental level were considered, F(2,49) = 5.60, p < .01, partial η² = .20. We were also concerned that the differences in looking behaviour during the Novel words test trials might be due to children’s willingness to comply with the request to look at the target object, as possibly suggested by the poor performance of the At-risk Poor-skills group in the Familiar words condition. We therefore entered the looking performance during the Familiar words condition as a covariate. The main effect of Group remained significant, F(2,45) = 5.59, p < .01. Post-hoc independent samples t-tests, using the Bonferroni correction, confirmed the interaction to be due to the At-risk Poor-skills group looking significantly less towards the referred object than both the Controls (p < .01) and At-risk Typical-skills groups (p < .01). While control and At-risk Typical-skills toddlers looked significantly longer than chance toward the Referred object (Control: M = .62, SD = .1, t(15) = 4.45, p < .01; At-risk Typical-skills: M = .61, SD = .13, t(23) = 3.90, p < .01), this was not so for the At-risk Poor-skills group (M = .44, SD = .15, t(9) = -1.13, p > .1). None of the significance levels were affected by removing data from the child with a community diagnosis of autism.
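The comparisons against chance above are one-sample t-tests with the chance proportion of .5 as the test value. The t statistic can be computed directly from the reported summary values, as sketched below; note that recomputing from the rounded M and SD in the text gives values slightly different from the paper's, which were presumably computed from unrounded raw data.

```python
import math

def one_sample_t(mean, sd, n, chance=0.5):
    """t statistic and degrees of freedom for a one-sample test of
    `mean` against the `chance` value, given sample SD and size n:
    t = (mean - chance) / (sd / sqrt(n)), df = n - 1."""
    t = (mean - chance) / (sd / math.sqrt(n))
    return t, n - 1
```

For example, the Control group summary (M = .62, SD = .10, n = 16) yields t(15) ≈ 4.8, in the same direction and order of magnitude as the reported t(15) = 4.45.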
Amanuensis” or “The Writer of the Letters”. Each of the rejected titles highlights the autobiographical elements of the story, Hardy himself having acted as an amanuensis for young women in Dorchester, who wished to write love letters to their sweethearts when they were serving as soldiers abroad (Hardy 1999, 67). But in changing the title of his story from “The Amanuensis” to “On the Western Circuit”, Hardy offered a much more fitting title for a story that is concerned with the law, and the institutionalized regulation of human desires. In a story that depicts human endeavours to exert individual free will against inescapable determinism that serves the interest of the human race as a whole, the results of the story highlight forces that are beyond human control – biological determinism, sexuality and class. These forces cannot be overcome, but they can be analysed and understood, and the narrator of “On the Western Circuit” ultimately regains the control that is lost with the gaze of desire, through the application of the gaze of objectivity. In a letter of 1902, Hardy remarked, “Well: what we gain by science is, after
3D environments expose additional challenges for pointing and selection, as targets can appear at different depths, and in larger fields of regard around the user. The user’s FOV is naturally controlled by head movement but there is no universally preferred selection method. The prevalent pointing metaphor for targets beyond manual reach is ray-casting . Gaze has been found to be faster than hand pointing, especially for distant objects . A range of works have compared eye and head pointing, showing that eye gaze is faster and less strenuous, while head pointing is often preferred as more stable, controlled and accurate [5, 10, 18, 23, 44]. As in 2D contexts, eye pointing can be combined with fast manual confirmation by click or hand gesture [41, 46], or with dwell time or another specific eye movement for hands-free selection [20, 31, 42]. In contrast to the 2D desktop setting, gaze in VR inherently involves eye-head coordination due to the wider FOV. This work is the first to reflect on how the naturally synergetic movement of head and eyes can be leveraged in the design of gaze interactions.
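The ray-casting metaphor mentioned above reduces to an intersection test: cast a ray along the pointing direction and select the nearest object it hits. A minimal sketch with spherical proxy targets follows; the function names and the sphere-proxy simplification are my own illustration, not any cited system's implementation.

```python
import math

def ray_sphere_hit(origin, direction, center, radius):
    """Distance along a normalized ray to a sphere, or None if missed.
    Solves |origin + t*direction - center|^2 = radius^2 for t >= 0."""
    oc = [o - c for o, c in zip(origin, center)]
    b = 2.0 * sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4.0 * c
    if disc < 0:
        return None  # ray misses the sphere entirely
    t = (-b - math.sqrt(disc)) / 2.0
    return t if t >= 0 else None

def pick(origin, direction, targets):
    """Return the name of the nearest target hit by the pointing ray.
    targets: list of (name, center_xyz, radius)."""
    hits = [(t, name) for name, center, r in targets
            if (t := ray_sphere_hit(origin, direction, center, r)) is not None]
    return min(hits)[1] if hits else None
```

The same routine works whether the ray comes from the hand controller, the head, or the eye gaze; only the origin and direction differ.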
The proposed driver monitoring system, based on eye gaze tracking and 3D head pose estimation, can help in continuously monitoring and alerting the driver in case of eyes off the road (EOR), distraction, or drowsiness. The system will give audio alerts and wheel vibration alerts depending on the situation. Reducing the number of accidents caused by distraction, in order to improve traffic safety, is the motivation behind this project.