3 UIAF system design
3.1 Stakeholders
3.2.2 User interaction coordination
The system should provide an application control metaphor that decouples the multimodal user interaction, or modality fusion, from the application logic. It is widely accepted that users interact naturally in a multimodal fashion when presented with applications that provide multimodal interaction [87]. As described earlier, the requirements in this category do not focus on the standard multimodal user interaction mechanisms of intention recognition and task planning; the focus of this work is on modalities that can be exchanged in different situations. Furthermore, users should be able to use multimodal interaction for system control (e.g. to pause a multimedia presentation delivery session), which gives them a way to control the behaviour of the multimodal user interface. These are the requirements for user interaction coordination:
- Input stream recognition (REQ021) - Recognition of user input from different modalities (e.g. speech, gesture, button press) is one key characteristic of user interaction coordination. Care has to be taken that the interpretation elements of all modalities follow the same semantic meaning. Time stamping of recognition events is another important aspect when several interaction events are actually to be fused. Recognition can also provide a low level of fusion capability, but it still needs to feed into a consistent semantic model for further processing. In general, the system must support the recognition of user interaction input streams.
- Fusion of input streams (REQ024) - Multimodal integration, or multimodal fusion of input streams, as discussed in the related work, is one of the main characteristics of multimodal user interface systems. Fusion is based on a so-called fusion mechanism or algorithm. For flexible fusion, a system should provide a fusion framework that allows different fusion algorithms to be used. Furthermore, fusion should be designed to accept application-specific fusion models. A general fusion model for system control also has to be taken into account; it lets the user control the system behaviour, for example for manual presentation control (play, stop and pause). Context information should be taken into account as well, which can essentially be achieved by addressing the device and modality description challenge. For better context-awareness, fusion can also consider context parameters other than direct user interaction, such as passive user input gathered from sensors (e.g. an activity sensor). Active and passive user input should be interpreted before it is used in the multimodal integration, in order to reduce processing overhead.
- Define application fusion model (REQ025) - This requirement relates to the capability of a Mobile application to define its own multimodal input model. The so-called fusion model is provided to the fusion framework, which performs multimodal integration on behalf of the Mobile application and so relieves the application of implementing its own multimodal integration mechanism. This model-based mechanism simplifies the design and development of Mobile multimodal applications. An application developer can choose the model that suits the application best and only has to define the application-specific command invocations, that is, the reaction of the Mobile application to a user interaction event.
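The interplay of the three requirements above can be illustrated with a minimal sketch. It is not the UIAF implementation: the names `RecognitionEvent`, `time_window_fusion` and `FusionFramework`, the semantic tokens and the command strings are all hypothetical, and a simple time-window grouping stands in for whichever fusion algorithm is actually plugged in. The sketch assumes that recognisers have already interpreted raw input into modality-independent meanings with time stamps.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List


@dataclass(frozen=True)
class RecognitionEvent:
    """One interpreted user input (hypothetical recognition data model)."""
    modality: str     # e.g. "speech", "gesture", "button"
    meaning: str      # modality-independent semantic token, e.g. "pause"
    timestamp: float  # seconds; needed to decide which events belong together


# A fusion algorithm groups events that form one multimodal interaction.
FusionAlgorithm = Callable[[List[RecognitionEvent]], List[List[RecognitionEvent]]]
FusionModel = Dict[FrozenSet[str], str]  # set of meanings -> command name


def time_window_fusion(window: float) -> FusionAlgorithm:
    """Group events that occur within `window` seconds of a group's first event."""
    def fuse(events: List[RecognitionEvent]) -> List[List[RecognitionEvent]]:
        groups: List[List[RecognitionEvent]] = []
        for event in sorted(events, key=lambda e: e.timestamp):
            if groups and event.timestamp - groups[-1][0].timestamp <= window:
                groups[-1].append(event)
            else:
                groups.append([event])
        return groups
    return fuse


class FusionFramework:
    """Runs an exchangeable fusion algorithm against two fusion models."""

    def __init__(self, algorithm: FusionAlgorithm, system_model: FusionModel):
        self.algorithm = algorithm
        self.system_model = dict(system_model)  # general system control model
        self.app_model: FusionModel = {}        # supplied by the application

    def register_application_model(self, model: FusionModel) -> None:
        self.app_model = dict(model)

    def process(self, events: List[RecognitionEvent]) -> List[str]:
        commands = []
        for group in self.algorithm(events):
            key = frozenset(e.meaning for e in group)
            # Assumption: system control takes precedence over the application.
            command = self.system_model.get(key) or self.app_model.get(key)
            if command is not None:
                commands.append(command)
        return commands


# Hypothetical models: system control plus one application-specific command.
SYSTEM_MODEL = {frozenset({"play"}): "system.play",
                frozenset({"stop"}): "system.stop",
                frozenset({"pause"}): "system.pause"}
APP_MODEL = {frozenset({"open", "point"}): "app.open_pointed_item"}

framework = FusionFramework(time_window_fusion(window=1.0), SYSTEM_MODEL)
framework.register_application_model(APP_MODEL)

commands = framework.process([
    RecognitionEvent("speech", "open", 10.0),
    RecognitionEvent("gesture", "point", 10.3),
    RecognitionEvent("button", "pause", 15.0),
])
print(commands)
```

In this sketch the application only supplies `APP_MODEL`, i.e. its command invocations; grouping by time stamp and matching against the models stay inside the framework, and a different algorithm can be passed to the constructor without touching the application.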
The three main requirements for multimodal application control mostly address the challenges of device and modality description (challenge 2) and multimodal application control (challenge 3). Table 2 shows the dependencies.
Challenges/requirements dependencies

Input stream recognition
    Challenge 2 (Device and modality descriptions): Defined input modality classes.
    Challenge 3 (Multimodal application control): Defined recognition data model. Timing information of input received.
Fusion of input streams
    Challenge 2 (Device and modality descriptions): Making the application input model comparable with the device descriptions model.
    Challenge 3 (Multimodal application control): Defined input interpretation mechanisms based on application input and recognition data model. Provide distinctive system control.
Define application fusion model
    Application input model for different fusion techniques.
Table 2: User Interaction coordination, relation between challenges and requirements
The research contributions related to this requirements category are a multimodal integration framework for application control and the definition of an application input model. Both are described in Chapter 4 (UIAF functionalities), Section 4.3 (multimodal application control).
3.2.3 Multimedia presentation delivery
The system must provide an optimal way to decide on and deliver rich multimedia presentations in ubiquitous environments. It should support single multimedia contents as well as more complex multimedia presentations. This category defines the following main requirements:
- Output decision (REQ058) - An optimal selection of the media items in a multimedia presentation and of the matching devices in the environment demands an adaptation process that incorporates a number of steps. First, incoming multimedia presentation descriptions need to be parsed into an internal presentation data model, which makes them processable for the adaptation step. This internal data model should be as independent as possible of the multimedia presentation and multimedia content formats, so that different formats can be handled. Secondly, a presentation schedule has to be established for each multimedia presentation in order to identify parallel multimedia contents or streams (e.g. video with subtitles). A flexible comparison of multimedia content characteristics against device capabilities is a further important aspect of the distribution decision. Whenever a decision leaves ambiguous delivery choices, in terms of the available device/content combinations, a user feedback mechanism should be available. This is a quality requirement related to usability. An optimal selection in future situations can be facilitated by capturing these user choices and using the learning capabilities of a context-awareness component. The interface to these components needs to be well defined. Finally, the decision or presentation preparation part should provide individual presentation descriptions tailored to each device.
- Content adaptation (REQ059) - Inevitably, some contents may neither suit the available devices in the environment nor match the context situation at hand (e.g. in-car news should automatically be delivered as speech). Content adaptation is therefore an important aspect of the presentation delivery process. Nevertheless, there are some specifics that content adaptation has to fulfil for optimal processing. For the decision itself, the important aspect is not the actual transformation of content from one format to another, but rather a clear view of the formats and quality levels in which content can be provided. Feeding this information into the adaptation and delivery process allows a better match between situation and device.
- Device specific presentation generation (REQ055) - This high-level requirement relates to the tailoring of presentation descriptions to specific devices. For each device, a corresponding presentation description with references to the contents needs to be provided. The timing and structuring information of the original presentation has to be preserved in this process.
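The decision steps described above (schedule, capability comparison, ambiguity handling and device specific descriptions) can be sketched as follows. This is only an illustration under stated assumptions: the classes `MediaItem` and `Device`, the integer quality scale and the three functions are hypothetical, and the user's answer to an ambiguous choice is hard-coded where a real system would ask the user or consult a learning component.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class MediaItem:
    """Entry of a format-independent internal presentation model (hypothetical)."""
    name: str
    media_type: str  # e.g. "video", "text", "audio"
    start: float     # seconds from presentation start
    end: float
    quality: int     # required quality level on an assumed integer scale


@dataclass
class Device:
    name: str
    capabilities: Dict[str, int]  # media type -> highest supported quality


def parallel_items(items: List[MediaItem]) -> List[Tuple[str, str]]:
    """Schedule step: find pairs of items whose time intervals overlap."""
    pairs = []
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            if a.start < b.end and b.start < a.end:
                pairs.append((a.name, b.name))
    return pairs


def decide(items: List[MediaItem], devices: List[Device]):
    """Compare content characteristics against device capabilities.

    Returns (assigned, ambiguous, unmatched): items with exactly one suitable
    device, items that need user feedback, and items that no device can
    render as they are (candidates for content adaptation).
    """
    assigned: Dict[str, str] = {}
    ambiguous: Dict[str, List[str]] = {}
    unmatched: List[str] = []
    for item in items:
        candidates = [d.name for d in devices
                      if d.capabilities.get(item.media_type, 0) >= item.quality]
        if len(candidates) == 1:
            assigned[item.name] = candidates[0]
        elif candidates:
            ambiguous[item.name] = candidates
        else:
            unmatched.append(item.name)
    return assigned, ambiguous, unmatched


def device_presentations(items: List[MediaItem], assigned: Dict[str, str]):
    """Build one description per device, keeping the original timing."""
    result: Dict[str, List[Tuple[str, float, float]]] = {}
    for item in sorted(items, key=lambda m: m.start):
        device = assigned.get(item.name)
        if device is not None:
            result.setdefault(device, []).append((item.name, item.start, item.end))
    return result


items = [MediaItem("clip", "video", 0.0, 30.0, 2),
         MediaItem("subtitles", "text", 0.0, 30.0, 1),
         MediaItem("jingle", "audio", 30.0, 35.0, 1)]
devices = [Device("tv", {"video": 3, "text": 1}),
           Device("phone", {"text": 2, "audio": 1})]

parallel = parallel_items(items)
assigned, ambiguous, unmatched = decide(items, devices)
assigned["subtitles"] = "tv"  # stand-in for the user's feedback on the ambiguity
presentations = device_presentations(items, assigned)
print(parallel, ambiguous, presentations)
```

Here the subtitles can be shown on either device, so they are reported as ambiguous rather than assigned silently; recording such choices is what would allow a learning component to resolve the same situation automatically later on.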
The three main requirements for multimedia presentation delivery mostly address the challenges of mobility (challenge 1), device and modality descriptions (challenge 2), multimedia presentation delivery (challenge 4) and integration of context information (challenge 7). Table 3 illustrates the dependencies.
Challenges/requirements dependencies

Output decision
    Challenge 1 (Mobility): Reacts to situation changes. Able to invoke re-decision.
    Challenge 2 (Device and modality descriptions): Match-ability of device descriptions and content descriptions. Defines quality based capability comparison mechanism.
    Challenge 4 (Multimedia presentation delivery): Define a complete adaptation and delivery process. React to multimedia delivery and context change triggers. Create internal presentation structure and schedule table.
    Challenge 7 (Integration of context information): Take context information into account by situation injections. Integrate with learning mechanism.
Content adaptation
    Provide list of alternative contents.
Device specific presentation generation
    Provide device specific presentation descriptions for device dependent content rendering.
Table 3: Multimedia presentation delivery, relation between challenges and requirements
The research contribution related to this requirements category is the complete definition of the multimedia presentation delivery process. The details are described in Chapter 4 (UIAF functionalities), Section 4.2 (multimedia presentation delivery).