Domain Knowledge and Conversation Topics - A framework and evaluation of conversation agents

A thorough examination of the logs, which contain over 3,280 utterances, showed that the human-machine dialogues covered topics from every aspect of everyday life, including emotion, love, sex, computers, entertainment and sport. As shown in Figure 6.4 and in the VisualChat illustration in Figure 6.6, several detailed topics were discussed within every category. About 39.4% of the IM exchanges concerned issues such as friendship, sex and love. This finding is remarkable given that the AINI conversation system is trained to simulate a human partner in IM. It arose because the IM users, who are mostly young people, wanted to tell AINI about private issues and experiences. In these dialogues some users praised AINI, some invited “her” on a date, and some disclosed their personal challenges. About 17.7% invited AINI to talk about the robot technology of CAs; some tried to test AINI's intelligence by arguing with “her”, and some tried to cheat. It is likely that this group includes CA developers, a number of whom came to “know” AINI from the “Invasion of the Robots Contest” websites. There were 53 CA programmers competing in the contest, and some of them realised after a short period of chatting with AINI that they were talking with a robot or a computer program.

[Figure 6.4, chart data on conversation topics: Education 8.0%; Health 2.2%; Computer 15.4%; Robot Technology 17.7%; Emotions 39.4%; Entertainment & sports 7.4%; Others 9.8%]

[Figure 6.5, bar chart. Categories: Natural Language Corpus, FAQChat, TREC Corpus, MindPixel, AAA, Random Answer. Groups: Domain-Specific, Open-Domain, Unanswered. Bar values as extracted: 1.1%, 1.1%, 0.6%, 1.7%, 9.8%, 85.7%. Axis: 0.0% to 90.0%.]

Figure 6.5: Frequency of AINI’s Responses based on Domain Knowledge Bases

As discussed in Chapter 3, AINI’s domain knowledge model incorporates several knowledge domains with the objective of giving users the best answer in a conversation. Figure 6.5 shows an analysis of the knowledge sources from which the answers were extracted. Of the 1,721 utterances in AINI’s log, 88.03% drew on Open-Domain knowledge bases and only 2.8% on Domain-Specific knowledge bases. As explained earlier, the experiment did not restrict the conversation to Domain-Specific knowledge on the SARS epidemic [111] or the Bird Flu pandemic [113]. However, AINI’s knowledge domain was equipped with crisis communication knowledge bases, which were included in the Natural Language Corpus and in FAQs extracted from online documents using AKEA [156]. In terms of frequency of appearance, the two terms SARS and Bird Flu occurred roughly equally often. They appear to be rather specialised terms, used in a restricted number of conversations in contrast with other words. AINI’s “buddies” learn of the availability of these domain knowledge bases from AINI’s Crisis Communication Network portal (CCNet) [117]. About 37 utterances relate to health-domain questions on topics such as the diagnosis, treatment, symptoms, spread, protection, cause, vaccination and risk of the SARS epidemic or Bird Flu pandemic.
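The breakdown by knowledge source described above amounts to a frequency count over the logged responses, grouped by the knowledge base each answer came from. A minimal Python sketch of such a tally follows. The log format, the `source_shares` helper and the toy entries are assumptions made for illustration; they do not reproduce AINI's actual log or analysis code.

```python
from collections import Counter

def source_shares(sources):
    """Return the percentage of responses drawn from each knowledge base."""
    counts = Counter(sources)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}

# Toy log (hypothetical): one entry per response, naming its knowledge source.
toy_log = ["AAA", "AAA", "AAA", "Random Answer", "FAQChat"]
shares = source_shares(toy_log)
print(shares)
```

Run over the real log, with one source label per AINI response, the same count would yield the per-source percentages plotted in a chart such as Figure 6.5.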

On the other hand, AINI drew 85.7% of its responses from the AAA knowledge bases. This is not a surprise, as the AAA knowledge bases cover most common topics, including emotion, sex, literature, music, religion, science and sports. More than 45,318 AAA stimulus-response categories are stored in AINI’s knowledge base. Each category contains a stimulus (also called an input-pattern) and a response (the output-template). A further, common sense knowledge base is made up of stimulus-response categories that came from the TREC and MindPixel corpora. Although common sense stimulus-response categories make up almost half of AINI’s knowledge bases (49%), only 2.3% of the total responses relate to common sense questions. Although common sense questions play a major role in formal conversation, AINI’s “buddies” are normally more interested in issues of daily life or personal interest than in the factoid questions provided in the TREC and MindPixel corpora.

AINI’s query engine works on the natural language query: if a matching category is found in the knowledge bases, it is retrieved and transformed into the output. If no matching category is found, the query engine sends the request to the random response knowledge base, and a generic answer is generated dynamically. These replies are sometimes inappropriate, amusing or thoughtless, and they comprised 9.82% of the total output of the IM conversations. Such irrelevant and unrelated expressions make AINI’s “buddies” feel irritated and confronted by AINI. They occur because of differences in manner of speech and in speech acts (e.g. declarative, interrogative, imperative or exclamatory), and because human IM users tend to use shorthand, acronyms, abbreviations and emoticons (see Chapter 7).
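The match-or-fall-back behaviour described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not AINI's implementation: the `CATEGORIES` and `RANDOM_RESPONSES` contents, the `normalise` and `respond` helpers, and the use of exact-match lookup in place of AINI's real pattern matching are all hypothetical.

```python
import random

# Hypothetical categories: each pairs an input pattern with an output template.
CATEGORIES = {
    "WHAT IS SARS": "SARS is a respiratory illness caused by a coronavirus.",
    "HELLO": "Hi there!",
}

# Hypothetical generic replies used when no category matches.
RANDOM_RESPONSES = ["Tell me more.", "I see.", "Why do you say that?"]

def normalise(utterance):
    """Upper-case the text and drop punctuation so it can be compared to patterns."""
    kept = "".join(ch for ch in utterance.upper() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())

def respond(utterance, rng=random):
    """Return (reply, source): a matched template, else a random generic reply."""
    key = normalise(utterance)
    if key in CATEGORIES:
        return CATEGORIES[key], "category"
    # No category matched: fall back to the random response knowledge base.
    return rng.choice(RANDOM_RESPONSES), "random"
```

With this sketch, `respond("What is SARS?")` is served from a category, whereas an IM-style utterance such as `"wat r u doing lol"` matches nothing and receives a generic reply, which mirrors how shorthand and abbreviations end up in the unanswered share.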

Unfortunately, AINI could not be trained to understand such expressions in the short period over which this study was conducted. However, AINI is capable of learning from domain experts through the Supervised Learning module (discussed in Section 3.5.4.7). The unanswered questions are maintained separately by a domain expert or ‘botmaster’, who keeps AINI’s knowledge bases regularly updated. The domain model is designed so that, in subsequent conversation sessions, AINI will ‘understand’ such input and should be able to participate in a meaningful conversation.
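The review loop described above, in which unanswered questions are set aside for a botmaster who then extends the knowledge base, might look roughly like the following Python sketch. The `KnowledgeBase` class, its `ask` and `teach` methods and the example utterance are hypothetical names introduced for illustration; the Supervised Learning module itself is described in Section 3.5.4.7 and is not reproduced here.

```python
class KnowledgeBase:
    """Toy knowledge base that records the utterances it could not answer."""

    def __init__(self):
        self.categories = {}   # input pattern -> output template
        self.unanswered = []   # utterances awaiting review by the botmaster

    def ask(self, utterance):
        """Return the stored template, or None after queuing the utterance."""
        key = utterance.strip().upper()
        if key in self.categories:
            return self.categories[key]
        if key not in self.unanswered:
            self.unanswered.append(key)
        return None

    def teach(self, pattern, template):
        """Botmaster adds a category; the pattern leaves the review queue."""
        key = pattern.strip().upper()
        self.categories[key] = template
        if key in self.unanswered:
            self.unanswered.remove(key)

kb = KnowledgeBase()
first = kb.ask("brb")                      # unknown shorthand: queued for review
kb.teach("brb", "Okay, I will wait for you.")
second = kb.ask("brb")                     # answered in the next session
```

Keeping the unanswered utterances in a separate queue means the botmaster reviews only the gaps, and each taught category removes its entry from that queue.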
