Though games especially benefit from intention, our work has broader implications within HCI. First, our definition of a sound slider is generic: a virtual speaker that indicates a value within a range by its position on a 3D line segment in the soundscape. For blind users, sound sliders can substitute for traditional UI sliders; brightness, temperature, or pressure gauges; progress bars; and any other display that shows a value within a range. They can also help users perform steering tasks in the classical sense [Accot and Zhai 1997] by representing a tunnel’s width.
Furthermore, the RAD can be used in place of AudioGPS [Holland, Morse, and Gedenryd 2002] and SWAN [Wilson, Walker, Lindsay, Cambias, and Dellaert 2007] for pedestrian navigation tasks. AudioGPS and SWAN tell users which way to walk, but the RAD can tell users how wide the path or bridge is, how much “wiggle room” they have, and whether they are in the middle or toward one side, helping them avoid oncoming foot traffic.
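The generic sound slider definition given earlier in this section can be illustrated with a minimal sketch. It is not the RAD's implementation: the class name `SoundSlider`, the method `position`, the example coordinates, and the choice to clamp out-of-range values to the segment's endpoints are all assumptions made for illustration. The sketch only computes where the virtual speaker would sit on the 3D line segment; rendering that position through a spatial audio engine is left out.

```python
from dataclasses import dataclass

Vec3 = tuple[float, float, float]


@dataclass(frozen=True)
class SoundSlider:
    """Hypothetical sound slider: maps a value in [lo, hi] to a speaker position."""

    start: Vec3  # speaker position when value == lo
    end: Vec3    # speaker position when value == hi
    lo: float
    hi: float

    def __post_init__(self) -> None:
        if self.hi <= self.lo:
            raise ValueError("hi must be greater than lo")

    def position(self, value: float) -> Vec3:
        # Normalize the value to t in [0, 1]; clamping is an assumption here.
        t = (value - self.lo) / (self.hi - self.lo)
        t = min(1.0, max(0.0, t))
        # Linear interpolation along the segment from start to end.
        return tuple(a + t * (b - a) for a, b in zip(self.start, self.end))


# Example: a progress bar laid out from the listener's left to right,
# one unit in front of the listener (coordinates are illustrative).
progress = SoundSlider(start=(-1.0, 0.0, 1.0), end=(1.0, 0.0, 1.0), lo=0.0, hi=100.0)
print(progress.position(50.0))   # (0.0, 0.0, 1.0): halfway, directly ahead
print(progress.position(150.0))  # (1.0, 0.0, 1.0): clamped to the right end
```

Under the same assumptions, the pedestrian case could reuse the slider by setting `lo` and `hi` to the left and right edges of the path and passing the walker's lateral offset as the value, so that a speaker heard straight ahead would correspond to walking in the middle of the path.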
6.7
Discussion
This chapter offers a vision of how video games can go beyond just being blind-accessible to being equivalently accessible to people who are blind, allowing them to play with a similar sense of control (intention) and efficiency as sighted players can. To this end, we introduce the racing auditory display (RAD) to help racing games become equivalently accessible to people who are blind. It comprises two novel sonification techniques: the sound slider for understanding a car’s speed and trajectory on a racetrack and the turn indicator system for alerting players of the direction, sharpness, length, and timing of upcoming turns.
Through a pair of empirical studies, we found that players preferred the RAD’s interface over that of Mach 1, a popular blind-accessible racing game, and at times “felt like [they] had as much information as if [they] could see the track” (P1). We demonstrated that the RAD makes it possible for a gamer who is blind to race comparably to casual players using sight.
In Section 7.2, we describe the limitations of this chapter’s user studies, the limitations of the RAD itself, and what we believe to be promising future work for user interfaces such as the RAD.
Chapter 7
Conclusions, Limitations, and Future Work
7.1
Summary of Contributions
In this dissertation, we described the concept of unmediated interaction as an interaction modality that we should strive for when designing computing devices to reduce or eliminate the burden of using those devices. We identified two instances of that burden: the overhead that users must undergo to provide input to the device, which we called input overhead, and the overhead that users must undergo to interpret output from the device, which we called output overhead. We argued that by eliminating input and output overhead from our interaction with devices, we can make it seem like those devices are not even there and that we are accomplishing computing tasks using our own abilities or powers rather than intermediate devices.
We then, in the bulk of this dissertation, introduced three computational methods for reducing input overhead and one for reducing output overhead. The methods cover a broad range of domains and intersect several fields including machine learning, computer vision, optimization, acoustics, and game design.
First, in Chapter 3 we showed how we can make it possible to eliminate the need for user inputs altogether via input data mining using input words. Namely, we showed how probabilistic topic models such as latent Dirichlet allocation and the player–gameplay action model, the latter of which we developed, can help us draw insights about video game players and the levels that they are playing. These insights can be used as a basis for recognizing players and personalizing their experience, all without their explicit input.
Next, in Chapter 4 we introduced gaze locking, a novel interaction modality for providing basic input in a nearly instantaneous way. Gaze locking is the notion of sensing eye contact directly from an image using a standard camera or existing images such as ones on the Web. By simplifying the continuous gaze tracking problem into the binary gaze locking problem, our gaze locking detector can exploit the special appearance of direct eye gaze, allowing devices to sense eye contact with over 90% accuracy at distances of up to 18 m. This in turn allows people to interact with computers, devices, and other objects just by looking at them.
In Chapter 5 we investigated how to make typing on small devices faster and less error-prone than it is when using the standard Qwerty keyboard. This work addresses instances in which users must provide devices with complex input and in which simply looking at the devices would not be enough to specify that input. Specifically, we explored how to modify Qwerty to make the word gestures that are used for gesture typing shorter and more distinct, and how to do so in a way that prevents users from having to learn how to type all over again. By performing a rigorous optimization procedure using three metrics that we developed, we discovered keyboard layouts that are not too different from Qwerty and that can reduce error rates by 52% over Qwerty.
Last, in Chapter 6 we investigated the problem of reducing output overhead to make racing games accessible to people who are blind. We introduced the racing auditory display (RAD), an audio system that works with a standard pair of headphones and that makes it possible for people who are blind to play the same types of racing games as sighted players can with a similar speed and sense of control to what sighted players have. The RAD works by using computation on the current game state to present players who are blind with stimuli that allow them to make the same moment-to-moment decisions that sighted players make while they race. We found that the RAD allows an avid gamer who is blind to race as well on a complex racetrack as casual sighted players can, without a significant difference between lap times or driving paths.
Together, we hope that these systems open the door to even more efforts in unmediated interaction, with the goal of making computers less like devices that we use and more like abilities or powers that we have.