CHAPTER 4 A FRAMEWORK FOR INTER-REFERENTIAL AWARENESS
4.1. A Process-Driven Framework
The difficulty in generating meaningful references to objects within the environment varies with the communication medium, the application domain and context. Many distributed groupware systems support the use of audio, video, text, 2D and 3D space, and may be either synchronous or asynchronous. Devices for interacting within these interfaces can be awkwardly stretched to work across different dimensions, such as a mouse interacting with 3D content. Objects may be in any number of states, and may be referenced through various criteria. Though we most often focus on the challenges of distributed virtual space in CSCW, objects need not be digital: real-world objects, such as those in mixed reality environments, are part of the natural context in shared spaces. Similar to virtual 3D content, these objects acquire additional spatial properties, such as proximity to participants and occlusion by other objects.
The Inter-referential Pipeline
We begin at an abstract level, viewing inter-referential awareness as a sequential process of selection, representation and acknowledgement (see Figure 20). In this context, the environment contains an implied set of participants and a set of objects. We describe selection as an atomic process in which, through the actions of an individual, a set of objects is chosen for reference. Selection can be decomposed into a cognitive cycle (the mental process of determining the selection) and a physical cycle (the act of making the supporting system aware of the objects). While the cognitive cycle must always occur, the physical cycle happens only if a computer-generated representation will support the reference (see footnote 20). In AR specifically, the physical cycle often does not occur when physical artifacts are selected for reference.
Figure 20 - The referential pipeline
We define representation as the means through which the attention of others is directed to a set of selected objects. Representation techniques are often visual, such as highlighting or alternate visualizations. Pointing, discussed in more detail below, is an example of a visual representation, just as the concomitant deictic speech that normally supports the reference is an auditory representation. This may at first seem counter-intuitive, but it can be argued that the purpose of pointing is to make others aware of an object. One might suggest that this logic fails for gestural interfaces, where the act of pointing informs the system of an object of reference; however, in that case pointing is an alternative method in the physical cycle of the selection phase. The distinction between the selection and representation phases is subtle, especially when addressing non-verbal referencing, such as changing one's pose, eye gaze, gesturing or using deictic speech. It is important, however, to clearly differentiate between the two.
20 It will be interesting to see how brain interfaces might merge these two cycles into one. Thus, one
The final phase of the pipeline is acknowledgement, which is the optional act of recognizing a reference and responding; this phase is heavily dependent on context. Of interest is how formally this occurs, as well as how the behavior of participants is affected when this phase is absent. In some systems, an acknowledgement may be a gesture, utterance or physical action [6]. In other systems (e.g. distributed collaborative surgery), a guaranteed acknowledgement becomes increasingly important; ensuring that the reference was unambiguous can be of extreme benefit in mission-critical applications.
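The three phases described above can be sketched as a small data model. This is only an illustrative sketch, not an implementation from this work: the class names, function names and fields (Selection, Representation, Acknowledgement, system_aware, computer_generated) are hypothetical. It assumes that a computer-generated representation requires the physical cycle to have run, while pointing or speech does not, and that acknowledgement is optional.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Selection:
    """Outcome of the selection phase (hypothetical structure)."""
    initiator: str
    objects: frozenset
    # True only if the physical cycle ran, i.e. the system was told
    # which objects were selected.
    system_aware: bool


@dataclass(frozen=True)
class Representation:
    """How others' attention is directed to the selected objects."""
    selection: Selection
    technique: str            # e.g. "highlight", "pointing"
    computer_generated: bool


@dataclass(frozen=True)
class Acknowledgement:
    receiver: str
    selection: Selection


def select(initiator: str, objects: set, physical_cycle: bool = True) -> Selection:
    # The cognitive cycle is implied by the call itself; the physical
    # cycle is optional and recorded as a flag.
    return Selection(initiator, frozenset(objects), system_aware=physical_cycle)


def represent(selection: Selection, technique: str,
              computer_generated: bool) -> Representation:
    # Assumption: the system cannot render a representation for a
    # selection it was never made aware of.
    if computer_generated and not selection.system_aware:
        raise ValueError("computer-generated representation needs the physical cycle")
    return Representation(selection, technique, computer_generated)


def acknowledge(representation: Representation, receiver: str,
                responds: bool) -> Optional[Acknowledgement]:
    # Acknowledgement is optional: None models an absent third phase.
    if not responds:
        return None
    return Acknowledgement(receiver, representation.selection)


# Example: a highlighted virtual object, acknowledged by one receiver.
sel = select("alice", {"valve_3"})
rep = represent(sel, "highlight", computer_generated=True)
ack = acknowledge(rep, "bob", responds=True)
```

Returning None for a missing acknowledgement keeps the absent phase explicit, so a system that needs guaranteed acknowledgement could check for it rather than assume a response.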
The Inter-referential Life Cycle
The pipeline above describes how referencing occurs chronologically; however, it is necessary to incorporate other factors that influence it, including the available channels of communication, common ground between participants, relationships between artifacts and participants, as well as the properties of those objects; together, these form the inter-referential life cycle. Because the life cycle intentionally avoids domain-specific techniques, it can be more easily merged with existing ontologies. For example, when applying this model to VR, the ontology developed by Bowman (on 3D selection techniques) can classify the selection techniques available for that domain. Figure 21 shows the integration of objects and participants as well as the relationships that exist among them. Further, it lists several of the spatial Object-Actor relationships and object states found within collaborative AR. Though more formally defined later, the figure is summarized here for clarity. The process begins with an initiator who has a set of relationships with one or more shared objects. Using some selection technique, a set of objects of reference (0 or more) is chosen and represented to a set of reference receivers, each of whom has relationships with the objects. An acknowledgement may or may not be generated for the initiator by these receivers, though our studies show that its absence can negatively affect referencing behavior. Note that the life cycle is independent of time, and therefore applies to asynchronous environments as well.
Figure 21 - The inter-referential life cycle (applied to AR)
The initiator and receivers share a context, which includes common ground, multiple channels of communication and the collaborative task. When applied to collaborative AR, these channels of communication may include spoken audio (or VoIP for remote scenarios), shared video, object states (e.g. pose), or contextual information about other participants (e.g. current visualization). The figure above also lists domain-specific relationships that exist between participants and objects as well as the states those objects may be in; these states and relationships are listed in more detail in the next section.
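The entities in the life cycle can likewise be sketched as plain data structures. This is a minimal sketch under stated assumptions rather than the formal definition given later: all names (Relationship, Context, Reference, relationships_for) and the string labels for relationship kinds and channels are hypothetical. It deliberately carries no timestamps, reflecting that the life cycle is independent of time.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relationship:
    """A participant-object relationship (labels are illustrative)."""
    participant: str
    obj: str
    kind: str  # e.g. "proximity" or "occlusion"


@dataclass
class Context:
    """Shared by the initiator and the receivers."""
    task: str
    channels: set[str]
    common_ground: set[str] = field(default_factory=set)


@dataclass
class Reference:
    initiator: str
    objects: frozenset[str]        # objects of reference, 0 or more
    receivers: frozenset[str]
    technique: str
    acknowledged_by: set[str] = field(default_factory=set)

    def acknowledge(self, receiver: str) -> None:
        # Only an intended receiver can acknowledge the reference.
        if receiver not in self.receivers:
            raise ValueError(f"{receiver} is not a receiver of this reference")
        self.acknowledged_by.add(receiver)

    def unacknowledged(self) -> frozenset[str]:
        # Receivers that have not (yet) responded; no deadline is implied.
        return self.receivers - self.acknowledged_by


def relationships_for(reference: Reference,
                      relationships: list[Relationship]) -> list[Relationship]:
    """Relationships linking the reference's participants to its objects."""
    people = {reference.initiator} | reference.receivers
    return [r for r in relationships
            if r.obj in reference.objects and r.participant in people]


# Example: two receivers, one of whom acknowledges.
context = Context(task="assembly", channels={"spoken audio", "shared video"})
ref = Reference("alice", frozenset({"heart_model"}),
                frozenset({"bob", "carol"}), technique="highlight")
ref.acknowledge("bob")
pending = ref.unacknowledged()
```

Tracking acknowledgements per receiver, instead of a single flag, lets the same structure serve both synchronous sessions and asynchronous ones where responses arrive much later or not at all.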