

3.1.2 Distribution Architecture

In a distributed VE system aiming to support real-time telecommunication, the combined delay from acquisition, computation, communication, and display is perceptible and can affect the interpretation of nonverbal cues. EyeCVE adopts the replicated database approach, which is common to distributed and real-time simulation and visualisation, and allows responsive user feedback. The architecture of EyeCVE, showing a simplified unidirectional view of information flow, is presented in Figure 3.1.

Based on input from dedicated tracking nodes, each client simulates the current state of the local user’s avatar and sends updates to peers via a server. Upon receiving such updates, remote clients render the avatar’s behaviour within the VE. Each client holds a replica of two tightly coupled databases: the simulation object model, which describes the behaviour of the virtual world; and the scene graph, which describes its appearance. For example, head and eye movements are updated in the simulation object model, which on remote sites is tied to the corresponding parts of the scene graph that visualise it. The scene graph hierarchy follows that of the simulation object model, but adds specific data defining object appearance, such as texture maps and geometric models. Rendering of the virtual scene is performed using OpenSG [RVB02], which supports a wide range of immersive display types including tiled, panoramic, L-shaped, and cubic. Audio communication between users interacting via EyeCVE is handled externally by Google Talk [Goo10b], which maintains low-latency Voice over Internet Protocol (VoIP) communication.

3.1. EyeCVE System Overview 83

Figure 3.1: Distribution architecture of EyeCVE. Two sites are shown. Site A tracks a CAVE user’s gaze, head, and hand motion, and distributes updates via the client software to Site B, where the tracked motion is mapped onto an avatar. The relay server facilitates login and message passing across a group of distributed users.

Time management is necessary to balance local responsiveness with consistency across client replications, both in terms of synchronisation and causality. Like many HITL simulations, the majority of update messages in EyeCVE describe discrete absolute movement as opposed to the difference from the previous state. This implies that the system can recover from a lost update as soon as a subsequent update for the same object arrives. It also means that preceding updates become obsolete and will not be further processed. Due to the temporal nature of real-time communication, it is preferable to transmit, process, and display updates as quickly as possible rather than to ensure that every update is received. However, this must be balanced with the need to ensure both the delivery and sufficient causal ordering of critical events such as a user picking up an object and passing it to another user, or clients joining and leaving. These vital events are expected to occur far less often than non-vital events, so the overhead from bandwidth and ordering of reliable transmissions is minimal. EyeCVE’s network layer was implemented using RakNet [Sof10a], which uses the User Datagram Protocol (UDP) for transmission, but employs additional mechanisms to assure reliable transmission of vital update messages. Ordering is applied to vital events in a separate ordering channel so as to maintain their causal sequence and to avoid discarding updates which have yet to be applied. Hence, EyeCVE transmits critical events using a reliable and ordered channel, and all non-critical events, such as the majority of body and eye tracking updates, unreliably.
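The receiving side of this two-class scheme can be sketched as follows. This is a minimal illustration under stated assumptions, not EyeCVE's or RakNet's implementation: the `Receiver` class, its method names, and the per-object sequence numbers are all hypothetical, and retransmission of lost vital events (the "reliable" part) is assumed to be handled by the network layer and is not modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Receiver:
    """Hypothetical receiver for absolute-state updates and ordered vital events."""
    latest_seq: Dict[str, int] = field(default_factory=dict)   # newest sequence number per object
    state: Dict[str, Tuple[float, ...]] = field(default_factory=dict)
    next_event: int = 0                                         # next vital event number to apply
    pending: Dict[int, str] = field(default_factory=dict)       # vital events that arrived early
    applied_events: List[str] = field(default_factory=list)

    def on_unreliable(self, obj_id: str, seq: int, pose: Tuple[float, ...]) -> bool:
        """Non-vital update: apply only if newer than the state already held."""
        if seq <= self.latest_seq.get(obj_id, -1):
            return False           # obsolete: a later absolute state was already applied
        self.latest_seq[obj_id] = seq
        self.state[obj_id] = pose  # absolute state, so a lost predecessor does not matter
        return True

    def on_reliable_ordered(self, event_no: int, event: str) -> None:
        """Vital event: hold it until every preceding event has been applied."""
        self.pending[event_no] = event
        while self.next_event in self.pending:
            self.applied_events.append(self.pending.pop(self.next_event))
            self.next_event += 1


rx = Receiver()
rx.on_unreliable("head", 1, (0.0, 1.7, 0.0))
rx.on_unreliable("head", 3, (0.1, 1.7, 0.0))  # update 2 was lost; the state still recovers
rx.on_unreliable("head", 2, (9.9, 9.9, 9.9))  # late arrival, discarded as obsolete
rx.on_reliable_ordered(1, "pass object")       # arrives before event 0, so it is held back
rx.on_reliable_ordered(0, "pick up object")    # releases both events in causal order
```

The asymmetry is the point of the design: tracking updates are never queued because a newer one supersedes them, whereas vital events are queued because discarding or reordering them would break causality.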

Figure 3.2: Bundling discrete eye and head and hand tracker data in one network message.

To ensure concurrency of nonverbal signals such as eye, head, and hand movement, EyeCVE bundles all updates into a single packet. Assembly and disassembly are done after and before the simulation cycle at the sending and receiving sites respectively. Figure 3.2 illustrates how discrete eye, head, and hand tracking data are combined before transmission over the network and applied to a remotely rendered avatar. The data packet contains 8 floats for each of the head and hand tracking nodes, describing 3D position (x, y, and z), quaternion rotation (x, y, z, and w), and scale; two floats for the tracked vertical and horizontal eye angles; and 4 bytes of overhead from the network layer for sequencing. Using floats with 64-bit precision (8 bytes), the 18 floats occupy 144 bytes, which together with the 4 bytes of sequencing overhead leads to a payload of 148 bytes for a single avatar update message. With a typical maximum transmission unit (MTU) of 1500 bytes for the Internet, this fits easily into a single Ethernet packet and, critically, minimises the likelihood of delays caused by fragmentation and reassembly in intermediate underlying protocols. To reduce network traffic, and thus the impact of buffer overload and computation on latency, EyeCVE only sends an update when a configurable spatial-temporal threshold has been reached. The use of spatial thresholds alone, as is typical in dead reckoning, has been shown to induce potentially uncapped inconsistency [RMM+05].
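The payload arithmetic and the send threshold can be checked with a short sketch. The wire layout below (a 4-byte sequence number followed by 18 little-endian doubles), the function names, and the threshold values are assumptions made for illustration; the text does not specify EyeCVE's actual field order, byte order, or defaults. Treating the threshold as "enough movement or enough elapsed time" is likewise one plausible reading of "spatial-temporal", chosen because the time bound caps how stale a remote replica can become.

```python
import math
import struct

# Assumed layout, not taken from EyeCVE's source:
#   4-byte sequence number, then 18 doubles
#   head: 3 position + 4 quaternion + 1 scale = 8
#   hand: 3 position + 4 quaternion + 1 scale = 8
#   eye : vertical and horizontal angle       = 2
AVATAR_UPDATE = struct.Struct("<I18d")  # "<" disables padding: 4 + 18 * 8 = 148 bytes


def pack_update(seq, head, hand, eye):
    """Bundle the head, hand, and eye samples of one simulation cycle into one message."""
    if len(head) != 8 or len(hand) != 8 or len(eye) != 2:
        raise ValueError("expected 8 head floats, 8 hand floats, and 2 eye floats")
    return AVATAR_UPDATE.pack(seq, *head, *hand, *eye)


def unpack_update(payload):
    """Split a bundled message back into (seq, head, hand, eye)."""
    fields = AVATAR_UPDATE.unpack(payload)
    return fields[0], fields[1:9], fields[9:17], fields[17:19]


def should_send(last_pos, pos, last_time, now, min_dist=0.01, max_interval=0.1):
    """Hypothetical spatial-temporal threshold: send on enough movement or enough elapsed time."""
    moved = math.dist(last_pos, pos) >= min_dist
    timed_out = (now - last_time) >= max_interval
    return moved or timed_out
```

Because all three trackers travel in the same 148-byte message, a receiver either applies a consistent snapshot of eye, head, and hand state or none of it, and the message stays far below a 1500-byte MTU.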

Motion tracking data is acquired and processed on two dedicated PCs at each site: one for head and hand tracking, and another for eye tracking. This follows a common approach that decouples the acquisition rate from render and simulation load, and provides the ability to filter unnecessary and outlying data before it reaches the client. For instance, as described in the following subsection, the initial eye tracker produced jitter and many erroneous outliers, which necessitated the use of a low-pass filter. The decoupled approach therefore allows filtered data, rather than the original noisy data delivered by the eye tracker, to be transmitted across the local area network at a configurable update rate to the local client. Body tracking was performed using a VRCO Trackd™ server coupled with InterSense IS-900 [Int10] head and hand sensor devices. The hand tracking device also enables navigation of the VE using a joystick and offers other interactions such as grasping virtual objects and triggering the eye tracker calibration procedure. The client reads and combines the latest tracking data from both PCs every simulation cycle.
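As an illustration of the kind of smoothing a tracking node could apply before forwarding eye angles, the sketch below uses a first-order exponential low-pass filter. The text does not state which low-pass filter EyeCVE used, so the filter type, the `LowPassFilter` class, and the `alpha` parameter are assumptions.

```python
class LowPassFilter:
    """First-order exponential smoothing of a scalar signal (illustrative only)."""

    def __init__(self, alpha=0.2):
        # alpha near 0 smooths heavily; alpha = 1 passes samples through unchanged.
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.value = None

    def update(self, sample):
        """Blend the new sample into the running estimate and return the estimate."""
        if self.value is None:
            self.value = sample  # the first sample initialises the filter
        else:
            self.value += self.alpha * (sample - self.value)
        return self.value


# One filter per eye angle, run on the tracking PC at the tracker's own rate.
horizontal = LowPassFilter(alpha=0.5)
smoothed = [horizontal.update(s) for s in (0.0, 10.0, 10.0)]
```

Running the filter on the tracking PC rather than in the client keeps the filter's sample rate independent of the render and simulation rate, which is the decoupling described above.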

In summary, EyeCVE’s system requirements of persistence and robustness as sites join and leave, server-side logging, sender-side scalability, and ease of firewall maintenance have led to the use of a central server that relays updates and maintains a local replication of the object model. A problem with this approach, which is ignored in favour of performance, is that updates are not synchronised across remote sites due to varying network characteristics. For example, updates for animating the head, hand, and eyes of an avatar that were sent out by site A may arrive marginally earlier at site B than at site C. Thus, users at sites B and C may perceive slightly different stimuli at a specific time. However, the approach to message bundling, which combines all body tracking data into a single update, ensures that users at sites B and C will view identical stimuli, albeit at approximately, rather than precisely, the same time. In any case, this is unlikely to have an operational impact on communication.

An alternative approach to synchronisation of temporal characteristics across all participating sites involves common reference to a centralised ‘time server’, with a programmed delay that is long enough to assure timely delivery on each site. However, due to the coupling of avatar head position and viewpoint rendering, this reduces responsiveness of the local system, as viewpoint updates from local head tracking can only be applied after they have been distributed to remote sites. Additionally, this approach cannot ignore the impact of differing delay induced by varying computation and projection technology between sites.
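The time-server alternative described above amounts to a fixed playout delay against a shared clock. The sketch below is a hypothetical illustration of that idea, not part of EyeCVE: the `PlayoutBuffer` class and its parameters are assumptions, and `now` stands for a timestamp obtained from the shared time reference.

```python
import heapq


class PlayoutBuffer:
    """Holds each update until send_time + delay on the shared clock (illustrative only)."""

    def __init__(self, delay):
        self.delay = delay  # programmed delay; must exceed the worst expected network latency
        self._heap = []
        self._count = 0     # tie-breaker so the heap never compares update payloads

    def receive(self, send_time, update):
        """Queue an update stamped with the sender's shared-clock time."""
        heapq.heappush(self._heap, (send_time + self.delay, self._count, update))
        self._count += 1

    def due(self, now):
        """Return, in playout order, every update whose playout time has been reached."""
        ready = []
        while self._heap and self._heap[0][0] <= now:
            ready.append(heapq.heappop(self._heap)[2])
        return ready
```

Every site releases a given update at the same shared-clock instant regardless of when it arrived. The cost noted in the text follows directly: if local head tracking is routed through the same buffer, the local viewpoint also lags by `delay`.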