2.2 Areas of Application

2.2.3 Wizard of Oz and Agents

One of the main applications of natural-language-based human-computer interaction is the use of agents and different sorts of intelligent advisers. At the end of the 1980s, Hill and Miller [1988] used a human to play the role of a simulated intelligent advisory system. The goal was to test different strategies for giving advice on the use of a statistical software package. Looking at how users of such a program would formulate questions, Guindon [1991] found that the generation of simple and rather restricted questions was predominant. Similarly, Pilkington [1992] used WOZ in a text-to-text fashion to provide help to test participants using the UNIX text editor vi. Later, trying to build a model for a professional advisor system, Hill [1993] found that, on the one hand, the advice given depends on what the wizard knows or assumes about the advice seeker’s knowledge, and on the other hand, that even if the advice solves a problem, approximately half of the advice seekers do not use it in an effective and efficient way.

One important aspect of intelligent agents is their potential ability to learn from their interactions with a person. Maulsby et al. [1993] used WOZ to look at this learning process by exploring how people would teach their virtual helper. How pedagogical agents can be used in a virtual learning environment, and how they influence group work, was the motivation for a WOZ study conducted by Jondahl and Mørch [2002]. Looking in particular at when and how an intelligent advisory system should present feedback, Mavrikis and Gutierrez-Santos [2010] found that contextual information plays a crucial role in effective advice giving. They compared three levels of simulation. First, they had the advice giver (facilitator) sitting next to the students. In this set-up, students and facilitator shared the same information space. In a second stage, the facilitator was located in a different room and used a remote desktop system (i.e. an open WOZ set-up) to communicate with the students. In this set-up many communicational cues, such as the student’s face or gestures, were lost. Finally, Mavrikis and Gutierrez-Santos used a fully automated advice system, built upon the results of the previous experiment stages, to give advice. They concluded that contextual information is of great importance if one aims at giving effective advice and consequently optimising a student’s learning progress. WOZ is an efficient method for building the relevant computer models supporting this agent behaviour (Mavrikis et al. [2012]; Rizzo et al. [2005]). Similarly, Tsovaltzi et al. [2008] and Braun and Rummel [2010] argue that an efficient tutoring system needs to adapt to a student’s progress. That is, adaptive and context-specific feedback is necessary to direct a learner’s interactions with an interactive learning environment to useful ends. Here, too, both research teams employed a human wizard to learn the relevant cues.

Searching for improvements on a different front, Kitamura and Tsujimoto [2003] found that using several agents in a simulated information retrieval task could broaden people’s interest in the topic. Conducting a simulated proof-of-concept study that integrated agent-like conversational abilities into a computer game character, Gustafson et al. [2005] argue that spoken dialogue technology has the potential to enrich a user’s experience. Dow et al. [2010] also showed that blending human and machine control can lead to a richer playing experience.

An additional area of agent research in which WOZ prototyping has seen increased application is the study of emotional aspects of human-computer interaction. Bradley et al. [2009], for example, used a simulated agent to engage people in a discussion about their personal photo library. The goal was to build a more human-like connection between a user and a virtual character, something for which WOZ simulation seems particularly suitable. Chen et al. [2010], on the other hand, looked at the difference between introverted and extroverted agents and found that those personality traits influence how users interact with the system, with the extroverted version leading to more talkative behaviour. Again a WOZ experiment was used to control the different modes. Similarly, an agent’s smiling and other facial cues have been explored. Bevacqua et al. [2010a] and Bevacqua et al. [2010b] report on users partially reflecting a virtual interlocutor’s behaviour. Blending on-task dialogue with social conversation (i.e. off-task dialogue), Silvervarg and Jönsson [2010] showed a positive effect on pupils’ attitudes and self-efficacy when studying mathematics. Testing new ways of interacting with emotion, Andersson et al. [2002] and Paiva et al. [2002] used a doll to explore emotional input for computer games. Here a wizard interpreted the expressed emotion (judging whether the expression was understood based on an instruction sheet of possible expressions) and controlled the game character accordingly. Scherer and Schwenker [2008], on the other hand, presented work that extracted emotions from recorded speech. Again, WOZ was used to trigger those dialogues and collect the relevant corpus. Finally, expanding on this, Walter et al. [2010] report on using WOZ experiments for building emotional profiles of users.

Lessons Learned

Based on the discussed literature, it can be argued that WOZ prototyping of agents is essentially an extension of what has earlier been categorized as WOZ prototyping for natural language. The main difference is its strong focus on simulating context-aware systems (e.g. Pilkington [1992]), which makes it more difficult to pre-define and control responses. In more recent years, experiments were furthermore augmented by a visual interpretation of the simulated agent, turning them into audio-visual interaction scenarios. While this may not influence the wizard perspective of an experiment, it significantly changes the interaction experience for a test participant. Hence, similar to multi-modal interaction scenarios, recent agent-based experiments usually have a strong focus on user experience and how it can be influenced (e.g. Dow et al. [2010]; Chen et al. [2010]). In some cases this can lead to an increased workload for the wizard. For example, a wizard is sometimes responsible not only for choosing appropriate responses but also for the facial expression and/or gesture output with which the agent delivers them. Hence, wizard tools for this type of experiment need to offer appropriate features that help control both the audio and the video channel. In summary, WOZ prototyping of agents can be seen as a hybrid that shares the goals and challenges of simulating language-based interaction scenarios (e.g. collecting language corpora) and combines them with some aspects of multi-modality (e.g. coherence of audio-visual output).

In document Supporting Wizard of Oz experimentation for language technology applications (Page 32-34)