The Difference Between One Modality and Multiple Modalities

Chapter 5: An Analysis of the Methodology

C. The Difference Between One Modality and Multiple Modalities

Depending on the context, the subjects in the 11 experiments perceived the synthetic robot expressions and the contexts either through the same visual channel or through multiple modalities. When the surrounding context was the affective pictures, perception took place in a single visual channel; when it was the recorded BBC News or the classical music, perception was multi-modal (visual-audio); and when it was the film clips, perception was also multi-modal, but visual-audio combined with video.

As speculated in Chapter 4, perceiving emotional cues through the same visual channel should be easier than perceiving them through multiple modalities. However, this speculation was not based on empirical data, since contextual effects were observed not only when subjects perceived the emotional cues from the same channel (e.g., the visual channel (still images) with the visual channel (robot face) in Image1 and Image2), but also when they perceived them from two channels (e.g., visual-audio in Speech1, Speech2, Music1 and Music2). The fact that no contextual effect was found for the film clips, where perception was multi-modal, might nonetheless support the speculation.

Nevertheless, care should be taken when conducting experiments in which emotional cues are perceived through multiple modalities. Experimenters should ensure that participants pay enough attention to both the robot head and the context.

5.2.2 The Rating Scheme

A forced choice rating scheme (e.g., questions 1 and 2 described in the response section) was used in all the experiments to rate not only the sequences of robotic expressions but also the emotional materials. Using a forced choice rating scheme to rate the sequences of robotic expressions was inspired by The Media Equation (Reeves and Nass, 1996). According to The Media Equation, good versus bad is a primary evaluation of mediated experience, so the evaluation of Positive/Negative applies to media. Bartneck et al. (2005) extended the "Media" to robots, since they often have an anthropomorphic embodiment and human-like behavior. In human face and avatar face studies (e.g., Noël et al.'s (2009) study), a single facial expression was often paired with a voice and presented to the subjects within a very short time period, e.g., a few seconds. In the studies presented in this thesis, each sequence of facial expressions was about three minutes long, almost the same length as each type of surrounding context. Therefore, a forced choice rating scheme makes it possible to evaluate subjects' overall impression after the long presentation of the robot face and the context.

The studies presented in this thesis also benefited from using a forced choice rating scheme to rate the emotional materials. In the previous 11 experiments, subjects were only asked to distinguish positive materials from negative materials, which could plausibly result in comparable accessibility of emotion categories for both the robotic facial expressions and the emotional surrounding contexts. This matters because Niedenthal et al. (2006b) speculated that, as a result of people's insufficient experience of categorizing situations in terms of emotion categories, facial expressions are easier and faster to categorize in terms of discrete emotions than descriptions of situations. For example, the experimental results of Fernandez-Dols et al. (1991) supported their prediction that if participants were trained to categorize situations using simple emotion terms, the weight of context in the judgment of emotional experience should increase.

At the same time, a forced choice rating scheme might push subjects' responses in a particular direction. It might be that contextual effects were found here precisely because a forced choice rating scheme was used. Future research could use a more flexible rating scheme to rate the sequences of robotic expressions (e.g., the Geneva Emotion Wheel), to see whether the findings in this thesis also apply to more subtle facial expressions. Likewise, more emotion categories could be used to describe the emotional materials in the rating scheme for contexts, although this would be challenging.

5.2.3 The Mood Effect

As described in Chapter 4, both the three-minute classical music and the film clips were observed to have a strong influence on subjects' mood states, while the robot expressions were observed to have no such mood effect. As mentioned before, although no statistical evidence was collected, Fig. 5.1 suggests that the News recordings were unlikely to induce strong moods in the subjects. In contrast, the affective pictures may have had the ability to color subjects' mood states, judging by the strength of the emotional valence of each material. It can therefore be concluded that different kinds of context may have different abilities to induce moods in subjects.

In some cases, the accompanying or preceding contexts could affect subjects' mood states, which suggests that subjects' mood states could also play a role in coloring the contextual effects. However, as argued in the last section of Chapter 4, a mood effect cannot be considered a sufficient factor for a contextual effect. At the same time, it remains unclear whether a mood effect is necessary for obtaining a contextual effect. Future experiments could control subjects' mood states to see whether a contextual effect can occur in the absence of a mood effect. If it can, it could be concluded that a mood effect is not necessary for obtaining a contextual effect.

After a context has been selected and convincing facial expressions of a robot have been created, care should be taken over how to present the robot face and the context (i.e., simultaneously or separately, on a first or second viewing of the robot face, and whether or not subjects have seen the robot expressions before). Whether or not a contextual effect is observed depends on the manner of presentation.

In document Contextual recognition of robot emotions (Page 196-199)