6.3 Qualitative Evaluation
6.3.3 Expectations and Disappointment
A rather small group of participants stood out by writing extremely detailed messages of three to ten sentences each. Due to its shallow natural language understanding capability, VPINO was not able to respond properly to such detailed descriptions of participants' problems. As a result, these participants became disappointed and demotivated. Some of them accepted the limitations of the system, and the conversation still led to positive results. Others, although originally motivated, stopped trying to continue the dialogue in a serious and motivated way. Once these users encountered critical situations in which VPINO could not meet their expectations, they radically changed their answering behaviour, giving only minimal answers or even ending the conversation.
On the other hand, in many of the almost human-like conversations, VPINO was able to surprise or impress the user. A recurring example of what impressed users is successful option reference resolution. Another example is particularly clever, proactively formulated responses by VPINO, which resulted from precisely formulated questions and correctly anticipated user reactions.
6.4 Discussion and Conclusions
Rational decision-making support with VPINO could effectively help users with their decision problems. The study supports our assumption that detailed natural language understanding is not necessarily required for the task of decision coaching, and that the system can nevertheless hold a human-like conversation. Our study showed that highly cooperative, motivated, and serious users had the highest rate of success. In addition, users with a more hypervigilant, less systematic decision-making strategy particularly profit from using the system.
User expectations regarding the intelligence, behaviour, and natural language understanding capabilities of VPINO seem to influence user acceptance and the effectiveness of the conversation. While intelligent behaviour by VPINO could impress and motivate users, a lack of intelligent behaviour leads to decreased user acceptance and therefore to less cooperative users. Whereas classical chatbots try to overcome this problem with obfuscation tricks that simulate intelligence or cleverness (for example, switching the topic or making a joke), systems for professional use do not have that option.
For the professional scenario of decision coaching, future work will require further improvements in solving intelligent sub-tasks that are relevant to the ongoing conversation and of value to the client. Furthermore, future work needs to evaluate the usefulness of rational decision support with VPINO on a broader target group of users.
Rational Decision Coaching: Follow-up Study
We conducted a follow-up study on rational decision coaching with an improved version of VPINO. The goal of this study was to gain further insights by focusing on a broader target group with respect to age, gender distribution, and educational background. Furthermore, we examined the effects on the participants' emotional assessment of their specific decision problems. For the user study, the set of sub-dialogues was optimized and particular natural language understanding tasks were improved.
This chapter is structured as follows: The motivation for this user study and a brief description of the improvements to VPINO are presented in Sect. 7.1. The setting of the user study is described in Sect. 7.2. The results of the study are presented in Sect. 7.3, while user feedback is presented in Sect. 7.4. The chapter closes with a discussion of the findings and conclusions in Sect. 7.5.
7.1 Motivation and Improvements on VPINO
VPINO is intended as a highly available coach that supports a large number of users with their problems. In order to evaluate VPINO's effectiveness as a decision coach on a broader target group of participants, we conducted a follow-up user study. To this end, the participants were recruited in public places, namely a shopping mall and a major train station.
VPINO was improved based on the insights gained from the previous study on decision coaching (Chapt. 5). For this purpose, the user feedback and the conversation transcripts from the first study were evaluated. We identified particular coaching questions and sub-dialogues that did not provide additional value for a majority of users. User feedback revealed that most participants valued the reflection part of the conversation more than a detailed discussion of the goals and a theoretically optimal solution in the problem framing phase at the beginning of the conversation. Furthermore, some participants criticized the length of the conversation before it reached an interesting part, which led to shorter and less reflective responses in the important reflection phase. To streamline and slightly shorten the conversation, we removed a small subset of sub-dialogues from the problem framing phase. The removed coaching questions were originally intended to identify theoretically ideal solutions to the users' problems, which are generally unrealistic. Additionally, sub-dialogues on "changing the perspective to an external person" were removed, since these questions were often not considered helpful for rational decision problems.
Besides optimizing the set of coaching questions, we improved the option reference resolution mechanism and developed a more elaborate strategy for pairwise option comparison (both presented in their final versions in Sect. 5.6 and 5.5).
Furthermore, the improvements to VPINO included minor changes in formulations, i.e. more precisely formulated questions and system responses, and slight improvements on particular DASM. In addition, the dialogue act classification rules were improved based on misclassified examples from the first phase.