ICVEs provide a platform in which to conduct AMC, connecting remote or collocated users of immersive projection technology systems such as the CAVE within a spatial, social and informational context, with the aim of supporting high-quality interaction [RWOS03]. A typical ICVE system relies on a combination of specialist hardware and software to deliver a real-time graphical simulation that is responsive to users’ tracked body movement and behaviour, and that operates simultaneously and consistently at multiple sites via a distributed database. The key concept behind ICVEs is that they are shared virtual worlds: computer-generated spaces whose occupants are represented to one another in graphical form, can control their own viewpoints, and can interact with each other and with various representations of data and computer programs [BBRG96]. User embodiment and avatars are discussed in Section 2.3. The current section presents an overview of the key features of ICVEs.
Immersion
Barfield and Furness define a VE as a representation of a computer model or database which can be interactively experienced and manipulated by the users [BFI95]. This definition is broad, and could be used to describe everything from a home video game through to a multi-user CAVE-based VR application. The critical difference between these two examples is each system’s level of immersion, which is defined as the degree to which a user’s sensory input channels, across all modalities, are stimulated by the VE interface [DKU98]. Thus, the distinction between an immersive and a non-immersive VE depends on the capabilities of the operating hardware, particularly regarding displays and input devices.
2.2. Visual Telecommunication 49
both interaction style and user embodiment within the environment [Ste96]. In an immersive system, the perspective of the graphical display is coupled to the user’s head through head tracking, so that the rendered displays match as closely as possible the changes that would be expected in reality when the same motions are made. Correspondingly, the movement and behaviour of a user’s embodiment in an immersive system may be coupled to bodily tracking devices, forming a natural interaction metaphor that maintains the user’s sense of proprioception in the surrounding environment. As discussed in the following section, this control metaphor becomes critical in a multi-user scenario, as user representation acts directly as a communication mediator [CWS99]. In contrast, the lack of tracking and of surrounding displays in a non-immersive system means that neither the rendering viewpoint nor the behaviour of a user’s embodiment is coupled to the user’s motion; both are instead controlled indirectly through standard input devices. Slater summarises the difference between an immersive and a non-immersive system by stating that, in an immersive system, it is possible in principle to fully simulate what it is like to use a non-immersive system, but not vice versa [Sla09].
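The coupling between head position and rendered perspective described above can be illustrated with a minimal sketch. It assumes a single flat display wall centred at the origin of the plane z = 0, with the tracked eye position given in the same units and located in front of the wall (z > 0). The names `Frustum` and `head_coupled_frustum` are hypothetical and are not taken from any system discussed in this chapter; the returned extents follow the left/right/bottom/top convention used by OpenGL's `glFrustum`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frustum:
    """Extents of an asymmetric view frustum, measured at the near plane."""
    left: float
    right: float
    bottom: float
    top: float


def head_coupled_frustum(eye, wall_width, wall_height, near):
    """Return near-plane frustum extents for an eye in front of a flat wall.

    Assumption: the wall is centred at the origin in the plane z = 0 and the
    eye looks along -z. `eye` is an (x, y, z) tuple from the head tracker.
    """
    ex, ey, ez = eye
    if ez <= 0:
        raise ValueError("eye must be in front of the wall (z > 0)")
    # Similar triangles: project each wall edge, taken relative to the eye,
    # from the wall distance ez onto the near plane.
    scale = near / ez
    return Frustum(
        left=(-wall_width / 2 - ex) * scale,
        right=(wall_width / 2 - ex) * scale,
        bottom=(-wall_height / 2 - ey) * scale,
        top=(wall_height / 2 - ey) * scale,
    )


# A centred eye gives a symmetric frustum; stepping to the right skews it.
centred = head_coupled_frustum((0.0, 0.0, 1.0), 2.0, 2.0, 0.1)
shifted = head_coupled_frustum((0.5, 0.0, 1.0), 2.0, 2.0, 0.1)
print(centred)
print(shifted)
```

Recomputing these extents every frame from the tracked eye position is what keeps the image on a fixed wall consistent with the viewer's motion. A non-immersive system would instead use a fixed, symmetric frustum regardless of where the user is.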
A VR system is defined by Cruz-Neira et al. as one which provides real-time viewer-centred head-tracking perspective with a large angle of view, interactive control, and binocular (stereoscopic) graphics [CNSD93]. This positions the CAVE at the apex of what constitutes a VR system, while a standard desktop computer would not be encompassed by the definition, despite its likely ability to run the same VE software. A critical point here is that both hardware and software components must be considered when ‘ICVE systems’ are discussed, and that the two components may be only loosely coupled. This implies that two remote users (one located in a CAVE, and one using a mobile device) can conceivably be connected via the same VE software, but interact according to the individual characteristics of their hardware: full-body tracking may feature in the CAVE, while a touch-screen input metaphor may operate on the mobile device. Such an interaction would therefore be asymmetric: the CAVE user would class the system as an ICVE, while the mobile user would perceive the communication to be taking place using a CVE. Chapter 3 presents the platform developed and used throughout the research in this thesis, EyeCVE, which operates on a variety of technologies, including fully immersive, semi-immersive, and standard desktop hardware. Figure 2.8 shows two users engaged in AMC supported by EyeCVE operating between a semi-immersive VR system, the WALL, and a fully immersive CAVE system.
The impact of such asymmetric interaction has been investigated in Schroeder et al.’s [SSA+01] and Roberts et al.’s [RWOS03] work on object-focused collaboration. Object-focused tasks in multi-user VEs involve the manipulation of virtual objects, for example to solve a logic puzzle or organise the virtual elements into a certain arrangement [HFH+00]. This process is analogous to product engineering design in the real world, but also provides a paradigm for visual programming techniques [OAS+02] and construction of new VEs [Ste96]. In an experimental scenario, the collaborative elements of the task are emphasised, so that progress toward the goal-state is either dependent on, or significantly aided by, the participants working together. Schroeder et al. [SSA+01] compared pairs of participants collaborating between networked CAVE systems (symmetric) with collaboration between a CAVE and a desktop computer (asymmetric), and also with the same task performed in the real world. Taking puzzle-completion time as the primary metric of task performance, participants took longer in the CAVE-to-desktop condition than in the CAVE-to-CAVE and collocated conditions, between which no significant difference was found. These results are echoed in Roberts et al.’s work on constructing a virtual gazebo, where participants working together in CAVE systems were able to complete the task in significantly less time than those who performed the task between asymmetric systems, or using non-immersive systems [RWOS03]. Figure 2.9 shows screen captures depicting the collaborative interaction scenarios investigated in the two experiments, reported in [SSA+01] and [RWOS03]. Both studies revealed interesting findings regarding collaboration and group dynamics. When asked, post-collaboration, to judge their contribution to the task, desktop users evaluated their share as less than that of CAVE users. In the Roberts et al. study, users of immersive systems were considered by all (themselves and desktop users) to contribute more than desktop users. Furthermore, where a team comprised two immersed users and one desktop user, the latter was left out of most of the activity. Similar results are observed in [SSUS00] and [WAS+00], in which users reported that they contributed unequally (favouring immersed users) despite being unaware of what type of system their partner was using. A system’s level of immersion also has implications for leadership, which is seen to be dynamically, and automatically, assigned to the more immersed user during a collaborative session [SSS+99, SSUS00]. Thus, just as the different forms of VMC systems support varying quality of communication, there also exist several technological tiers of multi-user VE systems, of which degree of immersion is the primary variable.
Figure 2.8: EyeCVE users engaged in AMC. The local user is located in the semi-immersive WALL system, featuring perspective rendering on a single projection wall. The cointeractant, represented as an avatar, is located in a fully immersive CAVE system featuring perspective stereo rendering within four surrounding display walls.
Figure 2.9: Screen captures provided courtesy of Ralph Schroeder and Dave Roberts. Left: Two participants collaborating to solve a simplified Rubik’s cube puzzle as presented in [SSA+01]. Right: Three participants working together to build a virtual gazebo as documented in [RWOS03].
Presence
The central feature of users’ response to VR is presence, which may also be referred to as place illusion (PI) [Sla09]. Immersion describes a system’s hardware, and also provides the boundaries within which presence can occur. Presence is best defined as a user’s psychological response to patterns of sensory stimuli, resulting in the user having the impression of “being there” in a computer-generated space [SUS94]. Slater explains that immersive VR systems can be characterised by the sensorimotor contingencies (SCs) that they support, referring to the actions that a user can carry out in order to perceive the VE [Sla09]. For example, moving one’s head and eyes to change gaze direction, or bending down in order to see underneath something, are SCs of a typical immersive VR system. The set of SCs supported by a system is analogous to the system’s level of immersion, and defines the set of valid actions that are meaningful in terms of perception within the VE. When a user is immersed within a VE, the sense of presence is maintained through SCs that provide synchronous correlations between the act of moving and the concomitant changes in the images that form perception. For instance, Pan describes an experiment in which a virtual representation of a woman interacts with participants through verbal and nonverbal behavioural cues [PS07]. Due to the head tracking that featured in the CAVE system in which participants performed the experiment, when participants moved, the image of the virtual woman in their visual field updated as would be expected in reality. This is an example of an SC. During those experimental sessions, all participants found themselves automatically responding to this illusory woman, talking and behaving much as they would be likely to in reality. When a user is immersed in a VE, such personal and plausible feedback is critical in maintaining the illusion of reality.
Anecdotal evidence from post-experimental interviews strengthens the case for this automatic response of presence: “The idea of cheating on my partner with this virtual woman caused a real physical and emotional response; this was the strongest and most surprising aspect of the experience” [PS07].
There is no single method for measuring a user’s sense of presence. Perhaps the most common approach is to use questionnaires which aim to elicit subjective responses regarding the VE experience. This was first demonstrated in [SU93], and a refined questionnaire later appeared in [SW97]. The questionnaires generally feature ordinal Likert scales that anchor responses between two extremes, and have been shown to be effective in eliciting meaningful responses in many cases. However, there are a number of criticisms of questionnaire-based measurement of presence: questionnaires have been shown to be unstable, in that prior information can influence the results [FAPI99]; they may be unable to discriminate between presence in a VE and physical reality [UCAS00]; and they may be prone to methodological circularity (asking questions about PI may foster the very phenomenon that the questionnaire is supposed to be measuring) [Sla04].
More robust and compelling measures of presence involve analysis of behavioural response to the VE stimuli. If participants in a VE behave as if they are in an equivalent real environment, this is a sign that they are experiencing presence. For example, the pit room is a classic VE which has evolved through several iterations to assess aspects of presence [UAW+99, Ins01, MIWBJ02]. The experiment recalls Walk and Gibson’s classic studies investigating responses to a ‘visual cliff’ [GW60]. The visual cliff consists of a narrow platform supported by vertical sides that drop a few inches to a large plate of glass. Both human infants and rats [WGT57] have been shown to demonstrate avoidance behaviour when confronted with the visual cliff. The VE that forms the stimulus for the presence experiments consists of two adjacent rooms: the starting room is unremarkable, populated with a few chairs and blocks, and serves to familiarise the user with the VR system and with the procedures for interaction and navigation. Upon walking into the second room, the user is confronted with a three-metre drop surrounded by a narrow walkway. When given the task of navigating to the opposite side of this room, almost all individuals do so by carefully edging themselves around the ledge of the room, even though they know that there is no real danger. In an experimental scenario, participants’ reactions to the pit have been quantified through physiological measurements such as heart rate, respiratory rate, and galvanic skin response. When compared to the baseline level of the first room, people experience significantly heightened physiological response. Meehan and Razzaque investigated the influence of passive haptics on response to the pit, in which real (but small) ledges were aligned to their virtual counterparts [MIWBJ02].
This added to the effect of standing over a real pit, and significantly increased the heart rate of participants compared with when the ledge was absent. Thus, if the normal physiological response of a person to a particular situation is observed in a VE, this is a sign that the user is experiencing a high level of presence. The obvious drawback of such measurement methods is that they are limited to situations in which there is a significant physiological response in reality, so are less useful in mundane situations [SVS05].
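The baseline comparison described above can be sketched as follows. This is only an illustration of the arithmetic, assuming one heart-rate reading per participant in each room. The function name `mean_change_from_baseline` and the sample values are hypothetical and are not data from the cited studies, which also applied significance testing that is omitted here.

```python
from statistics import mean


def mean_change_from_baseline(baseline_bpm, stimulus_bpm):
    """Mean per-participant change in heart rate (stimulus minus baseline).

    Both sequences are assumed to hold one reading per participant, in the
    same participant order.
    """
    if len(baseline_bpm) != len(stimulus_bpm):
        raise ValueError("need one baseline and one stimulus reading each")
    if not baseline_bpm:
        raise ValueError("need at least one participant")
    # Pair each participant's readings so that individual differences in
    # resting heart rate cancel out.
    changes = [s - b for b, s in zip(baseline_bpm, stimulus_bpm)]
    return mean(changes)


# Illustrative values only (beats per minute), not experimental data.
training_room = [70, 65, 80]
pit_room = [78, 70, 91]
print(mean_change_from_baseline(training_room, pit_room))
```

Using within-participant differences rather than comparing raw group means reflects the design described in the text, in which each person's response in the pit room is judged against their own level in the first room.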
The issue of social presence, also referred to as copresence, arises when discussing multi-user VEs. Copresence is highly relevant to user representation and embodiment, and is therefore discussed in the following section on avatars. Spatiality, a feature of ICVEs that renders them particularly germane to natural remote social interaction, forms the final topic discussed in the current section.
Spatiality
In an early evaluation of teamworking in AMC supported by non-immersive CVEs, Hindmarsh et al. suggested that some of the limitations to natural collaboration that had been observed in such non-immersive systems could be alleviated by their immersive counterparts [HFH+98]. A subsequent study by the same authors confirmed the predicted benefit of intuitive bodily tracking (head and hand) and of the surrounding nature of the displays, which provided users with a field of view similar to that available in natural perception, in enabling more natural interaction [HFH+00]. Investigating various collaborative tasks with the same participants over extended periods of time (a situation that is closer to the practice of actual collaborative work), Steed et al. conclude that interaction in ICVEs is intuitive, and particularly suited to highly spatial and interactive tasks [SSH+03].
Figure 2.10: Comparison between visual telecommunication systems in terms of Benford et al.’s three dimensions of spatiality [BBRG96]. Note that to ensure consistent polarity of the axes, artificiality has been positively renamed as reality. System types include desktop VMC, “telepresence” VMC (e.g. MAJIC, telepresence systems), non-immersive CVEs, semi-immersive CVEs, and ICVEs. Face-to-face collocated interaction is also included.
Benford et al. suggest that the increased interest in spatial approaches to CSCW might be viewed as a shift of focus towards supporting the context within which work takes place, rather than the process of the work itself [BBRG96]. The authors suggest two dimensions representing the fundamental properties of spatiality by which visual telecommunication systems may be classified: transportation is analogous to presence, and is concerned with the extent to which users perceive that they have left behind their local space and have entered into some new remote space, while artificiality concerns the extent to which the space is either synthetic or is based on the physical world [BBRG96]. In addition, the authors state that spatiality is the primary dimension on which system types may also be positioned, and is the degree to which a system supports key spatial properties such as containment, topology, distance, movement, and a shared frame of reference. Figure 2.10 positions desktop VMC, “telepresence” VMC, non-immersive CVEs, semi-immersive CVEs, and ICVEs along Benford et al.’s [BBRG96] spatial dimensions.
As indicated by their high position on the reality dimension in Figure 2.10, VMC systems are likely to remain superior in terms of presenting the truthful appearance of fellow interactants in their remote environments. However, even state-of-the-art VMC struggles to provide high levels of immersion and presence, as denoted by the low transportation dimension, in a perceptually shared space, as denoted by the low spatiality dimension. High levels of spatiality and transportation are inherent to highly immersive VR systems, which enclose users in wholly synthetic environments at the price of faithful replication of reality.
2.3. Avatars 54