4.4 Pattern Model
4.4.2 Pattern Matching Example
Figure 4.14 shows an example pattern with a matching instance beneath it. The type of the root object is set to SetQuestion, and the result of a match is assigned to the variable commFunction. The multi-slot feature reference must contain one element of type ReferenceModel whose boolean feature resolved is restricted to the value true. Additionally, the ReferenceModel must contain one semanticContent of type Movie. The id of the movie must be set but is not restricted to a particular value; its content is assigned to the variable movie id. Furthermore, the feature knowledgeItem of the SetQuestion must equal the string “movie.trailer”. The given instance satisfies all of these restrictions. After the match process, the complete SetQuestion instance is assigned to the variable commFunction, and the variable movie id is set to the value “13”.
[Figure 4.14, matching instance: SetQuestion with reference → ReferenceModel (resolved = true; semanticContent → Movie with id = “13”, name = “IronMan3”) and knowledgeItem = “movie.trailer”]
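The matching behaviour described for Figure 4.14 can be illustrated with a minimal sketch. It is not the SiAM-dp implementation: the classes Obj, Pattern and IsSet, the function match, and the variable spelling movie_id are hypothetical names chosen for this illustration, plain Python objects stand in for EMF instances, and types are compared by exact name, so type hierarchies are ignored. A multi-slot feature is treated as matching when at least one of its elements matches.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Obj:
    """Hypothetical stand-in for an EMF instance: a type name plus features."""
    type: str
    features: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Pattern:
    """Hypothetical pattern node: required type, feature constraints, variable."""
    type: str
    features: Dict[str, Any] = field(default_factory=dict)
    bind: Optional[str] = None


@dataclass
class IsSet:
    """Constraint 'the feature must be set'; optionally binds its value."""
    bind: Optional[str] = None


def match(pattern, value, bindings):
    """Return the extended bindings if value satisfies pattern, else None."""
    if isinstance(pattern, IsSet):
        if value is None:
            return None
        return {**bindings, pattern.bind: value} if pattern.bind else bindings
    if isinstance(pattern, Pattern):
        if isinstance(value, list):
            # Multi-slot feature: the first matching element decides.
            for element in value:
                result = match(pattern, element, bindings)
                if result is not None:
                    return result
            return None
        if not isinstance(value, Obj) or value.type != pattern.type:
            return None
        for name, constraint in pattern.features.items():
            bindings = match(constraint, value.features.get(name), bindings)
            if bindings is None:
                return None
        return {**bindings, pattern.bind: value} if pattern.bind else bindings
    # Anything else is a literal restriction on the feature content.
    return bindings if pattern == value else None


pattern = Pattern("SetQuestion", bind="commFunction", features={
    "reference": Pattern("ReferenceModel", features={
        "resolved": True,
        "semanticContent": Pattern("Movie", features={"id": IsSet(bind="movie_id")}),
    }),
    "knowledgeItem": "movie.trailer",
})

instance = Obj("SetQuestion", {
    "reference": [Obj("ReferenceModel", {
        "resolved": True,
        "semanticContent": Obj("Movie", {"id": "13", "name": "IronMan3"}),
    })],
    "knowledgeItem": "movie.trailer",
})

result = match(pattern, instance, {})
```

In this sketch, result maps commFunction to the whole SetQuestion object and movie_id to “13”. An instance that violates any restriction yields None instead of a partial set of bindings.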
4.5 Summary
This chapter dealt with the first research question:
1. Modelling Language: Which requirements must be fulfilled by a meta-modelling language that is used for the declarative development of multimodal dialogue applications?
First, the chapter described approaches for the semantic representation of knowledge and briefly discussed their expressiveness and suitability for use in a multimodal dialogue system. In a requirements analysis, we derived a set of features for the meta-modelling language that is used in SiAM.
The next part introduced the Eclipse Modelling Framework (EMF), which was ultimately chosen, and argued why this framework fulfills the previously mentioned requirements. Additionally, several API extensions and their implementations were described. These comprise several algorithms for unification and overlay that turned out to be very valuable for knowledge processing in multimodal dialogue systems. Furthermore, the framework needed additional functionality for cloning model instances.
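To make the difference between the two operations concrete, the following sketch applies them to nested dictionaries. It is an illustration under simplifying assumptions, not the algorithms of the EMF extension: the functions unify and overlay and the exception UnificationError are hypothetical names, type information is ignored, and overlay is taken in its common reading as a default unification in which the newer (covering) structure wins on conflict.

```python
class UnificationError(Exception):
    """Raised when two structures carry conflicting values."""


def unify(a, b):
    """Merge two structures; any conflicting atomic value makes unification fail."""
    if isinstance(a, dict) and isinstance(b, dict):
        merged = dict(a)
        for key, value in b.items():
            merged[key] = unify(a[key], value) if key in a else value
        return merged
    if a == b:
        return a
    raise UnificationError(f"conflict: {a!r} vs {b!r}")


def overlay(covering, background):
    """Merge two structures; on conflict the covering value replaces the background."""
    if isinstance(covering, dict) and isinstance(background, dict):
        merged = dict(background)
        for key, value in covering.items():
            merged[key] = overlay(value, background[key]) if key in background else value
        return merged
    return covering


# Illustrative data only: an earlier dialogue state and a newer, partial update.
background = {"movie": {"id": "13", "name": "IronMan3"}, "time": "20:00"}
covering = {"time": "22:00"}

updated = overlay(covering, background)
```

Here updated keeps the movie from the background and takes the time from the covering structure, whereas unify(covering, background) raises UnificationError because the two time values conflict.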
The standard modelling solution only allows one to specify static content during design time. To overcome this restriction, the following section introduced the bindable concept which allows developers to define instances whose content is dynamically evaluated from script expressions during runtime, taking into account the current context.
Pattern matching is used throughout the dialogue platform in order to define semantic constraints on instances. The final part of the chapter introduced a pattern model which allows one to define patterns for EMF instances. Pattern matching in SiAM-dp respects type hierarchies and multi-slots, and patterns can specify arbitrary functions that are used for the validation of content.
5 Massive Multimodality in Cyber-Physical Environments
5.1 Introduction
We introduced the term Cyber-physical Environment (CPE) in Section 2.4. One feature of a CPE is the high number of devices that are spread throughout the surrounding environment but are part of an extensive network in the Internet of Things. From the dialogue platform’s point of view, all devices in the environment are possible input and output devices for realising the interaction between users and the environment. Input devices can be either controllers that serve the user as devices for multimodal input or sensors that make it possible to recognise non-intrusive activities in the environment. Output devices can be renderers that are part of a multimodal output representation or actuators that are directly controlled by the dialogue platform as part of the intelligent environment.
In a CPE, a multimodal dialogue system must be able to handle a great number of devices and modalities. A human in the environment is no longer bound to one specific computer or control interface. Instead, the user interacts inside the environment, changes position within it, and may switch between control interfaces in order to effect changes to the environment. Thus, in the perception of the human, the interaction with the various interfaces blurs into an interaction with the environment itself, independent of the devices of the CPE that are currently involved.
This new interaction paradigm gives rise to new requirements for the usability of the system. From the perspective of the user, the heterogeneous set of devices and modalities must be smoothly integrated, even if the user moves inside the environment. Different situations or users may place different demands on the applied devices and modalities. Thus, a free choice of modality and the arbitrary combination of modalities are imperative for the dialogue platform. The following requirements must be considered:
• Dynamic reconfiguration of the set of available devices
• Adaptation to device failure or unavailability
• High heterogeneity of input and output devices
• Modality independent representation of user and system intentions
These requirements are not fulfilled by the multimodal dialogue systems presented in the related work in Chapter 3. There, at most three modalities are integrated into one system and the set of modalities is fixed. Current commercial systems like Apple’s Siri or modern in-car systems also concentrate on two modalities, mostly GUI and speech, sometimes also gestures. The requirements are considerably different for multimodal platforms in heavily instrumented environments that try to capture all human senses. The platform must be flexible enough to concurrently integrate a massive, heterogeneous set of devices and modalities that can change dynamically during runtime. In addition, a great variety of actuators and sensors must be considered. We use the term massive multimodality to capture this extreme variety in input and output modalities.
On a technical level, the challenge for a massively multimodal system is to provide a uniform interface for devices that encapsulates the heterogeneity in protocols and technologies. For a better understanding of this heterogeneity, we first classify possible devices into several categories (Section 5.2). The communication between devices and the platform requires a common language, which is introduced in Sections 5.3 and 5.4. Section 5.5 then deals with the semantic representation of the meaning behind an interaction. An important task here is the conversion from a purely syntactic representation of input and output to a semantic level and back. This issue is discussed in Section 5.6, which introduces a generic rule-based approach that shifts content from the syntactic to the semantic level or vice versa. This allows the fast and easy integration of arbitrary devices through the declaration of mapping rules.
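The idea of declaring mapping rules per device can be illustrated with a small sketch. It does not reproduce the rule language of Section 5.6: the class MappingRule, the function to_semantic, the event fields (device_type, utterance, button_id) and the semantic structure are all hypothetical and serve only to show how input from different modalities can be lifted to one modality-independent representation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class MappingRule:
    """Hypothetical rule: applies to one device type and one kind of raw event."""
    device_type: str
    condition: Callable[[Dict], bool]   # test on the syntactic (raw) event
    build: Callable[[Dict], Dict]       # constructs the semantic representation


def to_semantic(event: Dict, rules: List[MappingRule]) -> Optional[Dict]:
    """Apply the first rule that fits the event; return None if no rule applies."""
    for rule in rules:
        if rule.device_type == event.get("device_type") and rule.condition(event):
            return rule.build(event)
    return None


# Two rules for two different devices that express the same user intention.
RULES = [
    MappingRule(
        device_type="speech",
        condition=lambda e: "trailer" in e.get("utterance", ""),
        build=lambda e: {"act": "SetQuestion", "knowledgeItem": "movie.trailer"},
    ),
    MappingRule(
        device_type="button",
        condition=lambda e: e.get("button_id") == "trailer",
        build=lambda e: {"act": "SetQuestion", "knowledgeItem": "movie.trailer"},
    ),
]

spoken = {"device_type": "speech", "utterance": "show me the trailer"}
pressed = {"device_type": "button", "button_id": "trailer"}

semantic_from_speech = to_semantic(spoken, RULES)
semantic_from_button = to_semantic(pressed, RULES)
```

Both events yield the same semantic structure, so later processing stages do not need to know which device produced the input. In such a design, integrating a further device amounts to declaring additional rules rather than changing the platform.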