

2.1.4 Section Summary

The properties of the hardware will clearly affect how we design our software to interact with it. This is important for the design of the Cruiser framework, particularly because at this stage of development we do not want to commit to one specific type of hardware. Factors to consider include:

• whether interaction is constrained to purely single- or multi-touch;

• whether touch points may reliably be identified as particular fingers, users, or hand/object shapes;

• whether interaction may be via a tool (e.g. a stylus);

• whether physical “clutter” on the table affects interaction (e.g. if it prevents input, or creates spurious input);

• whether the display may be occluded or have shadows cast upon it (e.g. when projected from above, meaning that a user leaning forward may occlude the display);

• whether the display works in various (or adverse) lighting conditions;

• the ability to interface with other artifacts (e.g. paper, laptop computers);

• screen privacy (i.e. allowing parts of the display to only be seen by a particular user);

• the optimal display size and resolution (dot pitch);

• a large field of view, and orientation independence (i.e. unaffected by viewing angle, or perhaps deliberately tailoring the view for specific angles);

• whether it is robust, with regard to the potential for breakage, spillage, and use as a normal table (for reading, placing objects upon, etc.);

• the responsiveness (interaction lag);

• setup and calibration requirements;

• whether there is parallax error for touch (e.g. when rear-projected).

From these, one could derive the properties that the ideal tabletop hardware might have. But with current technologies, these properties cannot all be achieved at once. For example, overhead-projected displays allow the surface to have arbitrary robustness and good field of view, but create shadows, require calibration and do not work well in brightly-lit environments; rear-projected or embedded displays can be covered in thick glass or acrylic for robustness, but this causes parallax error and can limit viewing angles (e.g. due to reflection or the screen technology). This list is not exhaustive – the hardware technologies are moving quickly, and more properties will emerge; perhaps some that currently exist primarily in people’s imagination (e.g. interactive volumetric displays).

While hardware is not the focus of this research, Cruiser has been designed for platform independence (§1.2.3). Even so, to achieve a workable design, it is essential to design the software aspects of tabletop systems with a grasp of the capabilities of the tabletop hardware. The Cruiser system has been tested on DiamondTouch, Mimio, SmartBoard and the Braccetto table made by JumboVision, as well as with a traditional mouse, since aspects of the application hold promise for those not using an interactive tabletop interface.

2.2 Interaction Techniques

This section covers interaction techniques and observed behaviours that are decoupled from a particular piece of hardware or software. This research is important because it takes the focus away from an implementation in order to highlight human behaviour, potential pragmatic problems in tabletop interface design, and more generalised interaction techniques for collocated tabletop collaboration.

All research mentioned so far has involved some novel hardware prototype and the applications researchers have built around it. In what follows, researchers have taken a step back to consider the core problems involved and present findings that could be relevant to any tabletop, gestural or collaborative interface.

2.2.1 TableTop Interaction

Given the novelty of tabletop interfaces, little work has focused specifically on their design. Some foundational work exists, such as that of Scott [2005], who studied users performing collaborative tasks and managing images at a tabletop, with a view to designing personal spaces and storage bins. Scott, Grant, and Mandryk [2003] established a set of guidelines, which suggest that technology must support:

1. natural interpersonal interaction,

2. transitions between activities,

3. transitions between personal and group work,

4. transitions between tabletop collaboration and external work,

5. the use of physical objects,

6. accessing shared physical and digital objects,

7. flexible user arrangements, and

8. simultaneous user interactions.

They also survey the extent to which hardware research at the time provided support for these guidelines. Hardware has since changed, and an updated version of guidelines may be better suited to help evaluate current tabletop interfaces (see §4.1.1).

Nacenta, Aliakseyeu, Subramanian, and Gutwin [2005] examine “reaching techniques” for tabletop displays, that is, techniques for transferring (with a stylus) an object from the screen of a TabletPC onto a specific location in a tabletop environment on which the TabletPC is placed. They compare six techniques and find the best performer to be a “Radar” technique, where a reduced representation (map) of the surrounding environment appears when the object is touched, into which the pen is moved to indicate the desired position. The other techniques examined, in decreasing order of performance, are: Pick&Drop, Pantograph, Sling Shot, CoGestures and Press&Flick.

2.2. Interaction Techniques CHAPTER 2. BACKGROUND

Direct Input Devices

Advantages:

• support natural, fluid gestures

• support coordination through greater awareness of intention and action

• allow for noticeable gestures

Disadvantages:

• user may become tired

• items on far side of table are difficult to reach

• noticeable gestures may be distracting

• input device may obscure display

• users may physically collide in workspace

Other properties:

• device may be seen as “invasive” into partner’s territory on display. This may improve coordination, or may unnecessarily restrict activity in some regions of the display.

Table 2.1: Properties of direct input devices (findings of Ha et al. [2006])

Ha, Inkpen, Mandryk, and Whalen [2006] explored how pairs of users interact when seated face-to-face at an interactive table. They conducted three studies: on input devices and collaboration, on awareness of intention, and on awareness of action. The studies compared how users behaved when using a mouse vs when using a stylus (a Polhemus Fastrak stylus). A summary of their findings is in Table 2.1.

Drawing on ideas and observations from tested tabletop interaction techniques is valuable when informing the design of new systems. In particular, they can help avoid repeating mistakes, and assist in choosing techniques that are well suited to the interaction task. This is particularly important for tabletop interfaces whilst toolkit support is still in its infancy.

Coordination Policies

Coordination policies – whether they be informal (tacit), formal, or enforced by the interface – allow multi-user interaction to proceed smoothly. Morris, Ryall, Shen, Forlines, and Vernier [2004b] investigate coordination policies and scenarios for collaborative touch interfaces in order to overcome a tendency for users to ignore social protocols in studies using DiamondTouch. The work presents a number of global and whole-element coordination policies and scenarios, which they categorise into proactive, mixed-initiative and reactive initiation types. However, exploration of the necessary toolkit-level support is left for future work.

In the design of PhoTable, we have mostly relied on social protocol for coordinating between storyteller and listener. Clearly, the impact of ignoring social protocols is highly dependent on the task. For photo sharing, we anticipate that the storyteller will take a dominant role, but we would not want an enforced policy to disrupt the naturalness of the interaction. However, the Cruiser framework does incorporate a variety of mechanisms that can be used to enforce coordination policies in the software, such as concepts of object ownership and restricting access to parts of the display (see §5.2.5).
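To make the idea of an enforced policy concrete, the following is a minimal sketch of how object ownership and restricted display regions might be combined into a single access check. It is an illustration only and not Cruiser's implementation: the names `Region`, `Item` and `may_manipulate` are hypothetical, and the sketch assumes hardware that can report which user produced a touch.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Region:
    """Axis-aligned rectangle on the display (hypothetical type)."""
    x: float
    y: float
    width: float
    height: float
    owner: Optional[str] = None  # None means the region is shared

    def contains(self, px: float, py: float) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


@dataclass
class Item:
    """An on-screen object such as a photograph (hypothetical type)."""
    name: str
    x: float
    y: float
    owner: Optional[str] = None  # None means anyone may manipulate it


def may_manipulate(user: str, item: Item, regions: Sequence[Region]) -> bool:
    """Return True if `user` may manipulate `item` under the enforced policy."""
    # Ownership rule: an owned item responds only to its owner.
    if item.owner is not None and item.owner != user:
        return False
    # Region rule: items inside another user's region are off limits.
    for region in regions:
        if region.contains(item.x, item.y) and region.owner not in (None, user):
            return False
    return True
```

A purely social protocol corresponds to leaving every `owner` as `None`, in which case the check always succeeds and coordination is left to the users.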

Indirect Input Devices

Advantages:

• allow items on far side of table to be easily accessed

• do not require much physical effort to use

• may be more familiar to users

• small pointer does not obscure elements on display

Disadvantages:

• reduce the amount and range of gestures

• multiple cursors may be distracting or confusing

• subtle gestures may go unnoticed

• lesser support for awareness of intention and action may impede coordination and collaboration

Other properties:

• space must be left on tabletop to accommodate device (close to user)

• user more likely to cross territorial boundaries with indirect device than direct device

Table 2.2: Properties of indirect input devices (findings of Ha et al. [2006])

2.2.2 Gestural Interaction

Early work by Minsky [1984] introduced the concept of gestures for single-user touch screen interfaces. This work developed a screen environment and gesture system called ButtonBox that recognised selection, move and path gestures for manipulating virtual buttons. There were also visual components, such as the copy button, which could be moved to overlap any other button then tapped to produce a copy of the overlapped button, and a knife to remove components. These elements are functionally similar to the Frame and Black Hole in Cruiser, although they are semantically different.

This gestural interaction work predated recent advances in computer graphics hardware [Meinds and Barenbrug, 2002], digital photography [Rodden and Wood, 2003] and multi-user touch technologies (e.g. Dietz and Leigh [2001]). The advent of such technologies has given rise to an opportunity for multi-user gestural interfaces using digital photography for high-resolution photo-sharing activities, which is a focus of this thesis. Without these advances, there would either be insufficient computing power for interactive photograph manipulation, lack of interest in digital photography, or lack of a multi-user computing interface with which to facilitate social interaction around the photographs. Yet, using gestures for interaction is not new.

The term gesture has been interpreted by researchers to mean different things. In Minsky’s work, both the single-touch manipulations of on-screen objects, and the actions of the knife and copy buttons upon other objects were gestures. The latter, tool-based interaction is an interesting approach to the incorporation of gestures in interactive systems, and is a particular inspiration for some of the techniques used in Cruiser. This also raises the issue of metaphor for leveraging users’ tacit knowledge to assist usability and, in particular, learnability. However, only some research interprets gesture in this fashion.

In other contexts, a gesture can involve the drawing of a glyph using a stylus, or a sequence of touches incorporating parts of (or whole) hands, and multiple touches. These interpretations will be discussed in the following subsections. Elsewhere in this thesis, a gesture simply refers to any kind of structured user input; directly or via a virtual tool in


2.2.2.1 Pen (Trace) Gestures

For pervasive computing interfaces and, in particular, collaborative tabletop interfaces, it becomes undesirable and impractical to supply each user with a keyboard and mouse [Kraut, 2003]. However, it is still sometimes necessary to execute a command which, on a regular display, would traditionally be done by a typed statement, key combination or menu selection. When used on interactive tabletop interfaces, traditional menus suffer from poor readability, inaccurate selection, orientation problems and significant occlusion of the menu items by users’ hands. Work such as DiamondSpin [Shen et al., 2004] used a radial menu system for executing a larger set of non-implicit commands. An alternative is to issue commands by drawing a glyph on the interface using a pen or fingertip. This is sometimes called a pen gesture or trace.

Mohamed, Haag, Peltason, Dal-Ri, and Ottmann [2006] introduce the idea of disoriented pen gestures – an extension from earlier works in pen gesture technology for conventional screen environments. They test their techniques for gesture recognition in a tabletop Ink Environment in the context of a turn-based board game – Monopoly. Their system can determine the direction (i.e. North, South, East or West) and hence the player, and thus determine if a command is made out of turn, for a set of asymmetric gestures. In particular, they discuss the demand rent gesture (an “N”), which must be performed out-of-turn. The system is able to classify approximately 98% correctly. However, toolkit-level support is not discussed.
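The step from gesture direction to player can be illustrated with a short sketch. It assumes that a recogniser has already produced an orientation angle for the gesture; how Mohamed et al. obtain the direction is not described here, so the angle input, the `seats` mapping and both function names are hypothetical.

```python
# Table sides in order of increasing angle: 0, 90, 180 and 270 degrees.
SIDES = ("East", "North", "West", "South")


def side_from_angle(angle_degrees: float) -> str:
    """Quantise an orientation angle (degrees, counter-clockwise from East)
    to the nearest of the four table sides."""
    index = round((angle_degrees % 360.0) / 90.0) % 4
    return SIDES[index]


def check_turn(angle_degrees: float, seats: dict, current_player: str):
    """Return (player, out_of_turn) for a gesture drawn at the given angle.

    `seats` maps a side name to the player sitting there (an assumption
    made for this sketch).
    """
    player = seats[side_from_angle(angle_degrees)]
    return player, player != current_player
```

An out-of-turn command such as the demand rent gesture would then be accepted when `out_of_turn` is true, while ordinary commands would be rejected in that case.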

In his PhD thesis, Long [2001] explores techniques for rapidly learning and recognising oriented pen-based gestures using a feature-based algorithm. His system, called Quill, is intended as a design tool for rapidly prototyping a recognisable gesture language for pen interfaces, such as for the TabletPC. However, orientation-independence was not considered. This, combined with a Java implementation, made it poorly suited to incorporation into Cruiser.

More recently, Wobbrock, Wilson, and Li [2007] presented the “$1 recognizer”, which appears to solve many of the problems faced in incorporating a recogniser for commands on an interactive tabletop. Specified in approximately 100 lines of pseudocode, it was able in its evaluation to match the correct gesture amongst 16 possibilities with over 99% accuracy. It is also orientation independent, which is important for a multi-user tabletop. Determining the particular angle of the gesture (in order to determine the user performing it) is not discussed in their work, but it is clear from the elegant pseudocode implementation that this could also be reported by the recogniser. Clearly, though, the gestures must also be designed for this, as any gesture with rotational symmetry will limit the range of angles that can be reported.
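The following is a compact sketch of the kind of recogniser described, written from the general steps of the $1 approach (resample, rotate to a canonical angle, scale, translate, then search for the best-fitting rotation). It is not the authors' reference code. The returned `rotation` value is an addition made for this sketch to show how the angle of a gesture relative to its template might be reported, and the constants and function names are assumptions.

```python
import math

N, SIZE = 64, 250.0                   # resample count, reference square side
PHI = 0.5 * (math.sqrt(5.0) - 1.0)    # golden ratio, used by the angle search


def resample(pts, n=N):
    """Resample a stroke to n points spaced equally along its path."""
    pts = list(pts)
    step = sum(math.dist(a, b) for a, b in zip(pts, pts[1:])) / (n - 1)
    out, acc, i = [pts[0]], 0.0, 1
    while i < len(pts):
        (x0, y0), (x1, y1) = pts[i - 1], pts[i]
        d = math.dist(pts[i - 1], pts[i])
        if d > 0 and acc + d >= step:
            t = (step - acc) / d
            q = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
            out.append(q)
            pts.insert(i, q)          # q starts the next segment
            acc = 0.0
        else:
            acc += d
        i += 1
    return (out + [pts[-1]] * n)[:n]  # pad if rounding left us short


def centroid(pts):
    return (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))


def rotate(pts, angle):
    """Rotate points about their centroid by angle (radians)."""
    cx, cy = centroid(pts)
    c, s = math.cos(angle), math.sin(angle)
    return [((x - cx) * c - (y - cy) * s + cx,
             (x - cx) * s + (y - cy) * c + cy) for x, y in pts]


def normalise(pts):
    """Resample, rotate to zero indicative angle, scale to a square and
    centre on the origin. Also returns the indicative angle removed."""
    pts = resample(pts)
    cx, cy = centroid(pts)
    theta = math.atan2(cy - pts[0][1], cx - pts[0][0])  # first point to centroid
    pts = rotate(pts, -theta)
    xs, ys = [x for x, _ in pts], [y for _, y in pts]
    w, h = (max(xs) - min(xs)) or 1.0, (max(ys) - min(ys)) or 1.0
    pts = [(x * SIZE / w, y * SIZE / h) for x, y in pts]
    cx, cy = centroid(pts)
    return [(x - cx, y - cy) for x, y in pts], theta


def path_distance(a, b):
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def best_angle(pts, tmpl, lo=-math.pi / 4, hi=math.pi / 4, tol=math.radians(2)):
    """Golden-section search for the rotation of pts that best fits tmpl."""
    f = lambda ang: path_distance(rotate(pts, ang), tmpl)
    x1, x2 = PHI * lo + (1 - PHI) * hi, (1 - PHI) * lo + PHI * hi
    f1, f2 = f(x1), f(x2)
    while abs(hi - lo) > tol:
        if f1 < f2:
            hi, x2, f2 = x2, x1, f1
            x1 = PHI * lo + (1 - PHI) * hi
            f1 = f(x1)
        else:
            lo, x1, f1 = x1, x2, f2
            x2 = (1 - PHI) * lo + PHI * hi
            f2 = f(x2)
    return (x1, f1) if f1 < f2 else (x2, f2)


def recognise(stroke, templates):
    """Return (name, score in [0, 1], rotation in degrees of the stroke
    relative to the best-matching template)."""
    cand, theta_c = normalise(stroke)
    best = None
    for name, pts in templates.items():
        tmpl, theta_t = normalise(pts)
        delta, d = best_angle(cand, tmpl)
        score = 1.0 - d / (0.5 * math.sqrt(2.0) * SIZE)
        rotation = math.degrees(theta_c - theta_t - delta) % 360.0
        if best is None or score > best[1]:
            best = (name, score, rotation)
    return best
```

Because matching is done after the indicative angle has been removed, the same template matches a stroke drawn from any side of the table, and the removed angle is what allows the drawing direction to be recovered. The rotation is measured counter-clockwise in a y-up coordinate system and is only meaningful for gestures without rotational symmetry.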

However, $1 is foremost a recognition algorithm and does not provide ways to design and evaluate gestures. iGesture [Signer et al., 2007] is another Java application that, like Quill, supports gesture designers in creating new gestures. It incorporates a number of recognition algorithms, and provides tools focusing on extensibility and cross-application reusability. It is released under an open source license and actively developed³.

Incorporation of a gesture command system was one of the goals considered for Cruiser. Much of the infrastructure required is already in place and well tested, in the form of the Gesture (§5.2.10), Command (§5.2.5) and Writing⁴ (§5.2.4) components of the framework. However, the implementation of the recogniser itself was deferred. The recently published $1 recogniser now makes it straightforward to implement a command gesture recogniser as a plugin.

³ http://www.igesture.org (verified 2008-05-04).

2.2.2.2 Multi-Touch Gestures

Wu and Balakrishnan [2003] also consider gestural inputs. The work explores a grammar of twelve multi-finger and whole-hand gestures in RoomPlanner, a collaborative furniture layout application. They present qualitative feedback from five users who each participated in a one-hour trial while collaborating with one of the authors. The feedback suggested that learning the gestures did not take much time. Issues of occlusion when selecting menu items, and some limitations of the hardware (DiamondTouch), were noted.

Wu, Shen, Ryall, Forlines, and Balakrishnan [2006] add a stylus to the DiamondTouch, which effectively presents itself as an additional user to the hardware. They evaluate a system that uses two-handed and whole-hand gestures for moving images, annotation, erase (wipe), cut/copy and paste, and “piling” and “spreading” photographs. Participants from outside the lab were given an interactive tutorial demonstration and asked to rank the difficulty of the in-house gestures and state their agreement with a collection of statements using a Likert scale. The study noted issues of granularity of direct-touch systems, with some users having difficulty with pixel-accurate selection. Also highlighted was the importance of visual feedback throughout a gesture interaction.

Piper, O’Brien, Morris, and Winograd [2006] have also investigated the application of cooperative gestures and uses of the tabletop in collaborative educational software. This work focuses on building a shared interface to help develop effective social skills (called SIDES) for children with special needs. The interface for DiamondTouch is a four-player game in which users take turns to place directional tiles to create a single, agreed-upon path for a frog to collect insects, gaining points. The game was engaging, and the study identified many benefits of tabletop hardware (particularly when it can identify which user is interacting). It also highlighted significant challenges for designing an interface to encourage social interaction, although these were exacerbated by the target population of the study.

Multi-touch is so far only available in some hardware, and introduces complications in making the interaction robust. These problems of robustness go mostly unreported by researchers, but are gradually being solved. Manipulation of photographs on a version of the Microsoft Surface, for example, was not robust when using a single touch point – the system has to distinguish a “fat finger” from two adjacent fingers, which would rotate and resize, rather than move. Multiple users further complicate robustness issues, such as when two users touch the same photo to move it. Systems without user identification (DiamondTouch being the only exception) can only interpret this as a rotate/resize gesture.

However, my own experiences with the multi-touch features introduced in Apple products have been more favourable. The iPhone, iPod Touch, MacBook Air and 2008 models of the MacBook Pro feature these multi-touch capabilities. I have yet to “fool” the device into thinking my two fingers were one. Unfortunately, the algorithms that accomplish this are likely to remain proprietary and, to date, Apple have not provided third-party application developers API-level access to these features, as used by their own applications such as iPhoto.

The precursor to Cruiser was targeted specifically for DiamondTouch hardware, and the multi-touch aspects of the Cruiser framework have retained support for this. However, even with user identification, we quickly ran into robustness problems with multi-touch gestures (see [Apted et al., 2006], §7.1.6.2 and §A.3.1). These problems, and the limited available hardware support for multi-touch, have resulted in Cruiser and PhoTable primarily being used only with single-touch features enabled.

2.2.3 Interaction Design


contrast rotation and translation mechanisms, and survey existing techniques. Independent (translation/rotation), automatic rotation (continuous or discrete; i.e. oriented towards edge of a round table, or the edges of a rectangular table), integral (i.e. based on a physical model) and two-point (i.e. using two points of contact) are the techniques examined. Comparison metrics are presented in terms of degrees of freedom (for completeness and consistency). The required infrastructure to support the range of mechanisms, and user evaluation [Forlines et al., 2005], are yet to be reported.
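As an illustration of the two-point technique, the sketch below derives a combined scale, rotation and translation from the movement of two contact points. It is a generic geometric formulation and is not taken from the surveyed work or from Cruiser; the names `Transform`, `two_point_transform` and `apply_transform` are hypothetical.

```python
import math
from typing import NamedTuple, Tuple

Point = Tuple[float, float]


class Transform(NamedTuple):
    scale: float         # uniform scale factor
    rotation: float      # radians, counter-clockwise in a y-up system
    translation: Point   # movement of the midpoint between the contacts
    pivot: Point         # midpoint of the contacts before the movement


def two_point_transform(p0: Point, q0: Point, p1: Point, q1: Point) -> Transform:
    """Derive the transform that carries contacts (p0, q0) to (p1, q1)."""
    v0 = (q0[0] - p0[0], q0[1] - p0[1])
    v1 = (q1[0] - p1[0], q1[1] - p1[1])
    len0, len1 = math.hypot(*v0), math.hypot(*v1)
    if len0 == 0.0:
        raise ValueError("the two initial contacts must be distinct")
    angle = math.atan2(v1[1], v1[0]) - math.atan2(v0[1], v0[0])
    angle = math.atan2(math.sin(angle), math.cos(angle))  # wrap to (-pi, pi]
    m0 = ((p0[0] + q0[0]) / 2.0, (p0[1] + q0[1]) / 2.0)
    m1 = ((p1[0] + q1[0]) / 2.0, (p1[1] + q1[1]) / 2.0)
    return Transform(len1 / len0, angle, (m1[0] - m0[0], m1[1] - m0[1]), m0)


def apply_transform(point: Point, t: Transform) -> Point:
    """Scale and rotate `point` about the pivot, then translate it."""
    dx, dy = point[0] - t.pivot[0], point[1] - t.pivot[1]
    c, s = math.cos(t.rotation), math.sin(t.rotation)
    return (t.pivot[0] + t.translation[0] + t.scale * (dx * c - dy * s),
            t.pivot[1] + t.translation[1] + t.scale * (dx * s + dy * c))
```

Applying the result to every corner of an object keeps both contact points fixed under the fingers, which accounts for four degrees of freedom: two of translation, one of rotation and one of uniform scale.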

Cruiser uses a technique that combines rotation and resizing, rather than translation (§4.2.4). This technique is well-suited to manipulating photographs, as the aspect ratio generally does not change, and was found to work well in early usability studies (§7.1). However, Cruiser includes a flexible Gesture framework (§5.2.10) that would readily support alternative interaction techniques that could be enabled via a plugin.

Terrenghi, Kirk, Sellen, and Izadi [2007] compare affordances for manipulating digital vs physical items on interactive surfaces. Twelve participants provided 80 of their own, most recent digital photographs which were divided into two groups – a set of printed, physical photographs, and a set to be accessed digitally on an interactive tabletop. The first stage involved constructing a 25-piece puzzle – both a physical and a digital puzzle for each participant. In the second stage, participants were asked to sort their photographs into 3 groups – a set to discard, a set to keep but not share with others, and a set to keep and share.

Participants took longer to solve the digital puzzle, but the task duration of the sorting tasks was not significantly different. One interesting observation was the prevalence of bi-manual interaction in the physical tasks where, for digital tasks, one-handed interaction was more prevalent, with participants observed to use their non-dominant hand to support the weight of the body over the table. Some of the qualitative observations included a tendency for participants to lean closer to the table surface to focus on a single item, while in the physical task the photo itself was brought closer. This highlights the importance of being able to enlarge images easily, and perhaps even as an action separate from rotation and translation.

Studies such as these are informative if we are to base our own digital interactions