
3.4 Dialogue system components

3.4.5 Dialogue manager

The dialogue manager is the component in a dialogue system that is responsible for the state and the flow of the conversation and coordinates the activity of the subcomponents in a dialogue system.

A dialogue manager can be implemented in various ways, and different typologies exist. One such categorization[47] distinguishes finite state-based systems, frame-based systems and agent-based systems. A more fine-grained categorization[48] classifies systems as finite state-based, frame-based, sets of contexts, plan-based models and agent-based. These categorizations classify systems by the dialogue phenomena they handle and are ordered from least to most complex.

Finite state- (or graph-) based systems

In these systems the flow of the dialogue is specified as a set of dialogue states with transitions denoting various alternative paths through a dialogue graph. At each state the system produces prompts, recognizes (or rejects) specific words and phrases in response to the prompt, and produces actions based on the recognized response.

The following is an example of an interaction with a basic finite state-based system in which the system verifies the user’s input at each state of the dialogue:

System: What is your destination?
User: London.
System: Was that London?
User: Yes.
System: What day do you want to travel?
User: Friday.
System: Was that Sunday?
User: No.
System: What day do you want to travel?

The dialogue states and their transitions must be designed in advance. The previous interaction could be controlled by a finite state automaton like the one in figure 3.2.

[Figure 3.2 diagram: states “What is your destination?”, “Was that ⟨dest⟩?”, “What day do you want to travel?” and “Was that ⟨day⟩?”, connected by transitions labelled ⟨dest⟩, ⟨day⟩, Yes and No.]

Figure 3.2: A partial finite-state automaton architecture for a dialogue manager

An example of a system using this approach is the Nuance automatic banking system[47].
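The automaton of figure 3.2 could be sketched roughly as follows. This is a minimal illustration, not the implementation of any particular system: the state names (`ask_dest`, `confirm_dest`, `ask_day`, `confirm_day`), the terminal `done` state and the helper functions are assumptions introduced here, and the inputs are assumed to be the words the recognizer returned (which is why "Sunday" can appear where the user said "Friday").

```python
# Prompts follow the example dialogue; the state names are hypothetical.
PROMPTS = {
    "ask_dest": "What is your destination?",
    "confirm_dest": "Was that {dest}?",
    "ask_day": "What day do you want to travel?",
    "confirm_day": "Was that {day}?",
}


def step(state, recognized, slots):
    """Return the next state for the current state and recognized input."""
    if state == "ask_dest":
        slots["dest"] = recognized
        return "confirm_dest"
    if state == "confirm_dest":
        # "No" sends the dialogue back to the question that was misrecognized.
        return "ask_day" if recognized.lower() == "yes" else "ask_dest"
    if state == "ask_day":
        slots["day"] = recognized
        return "confirm_day"
    if state == "confirm_day":
        # "done" is an assumed end state; the figure shows only part of the automaton.
        return "done" if recognized.lower() == "yes" else "ask_day"
    raise ValueError(f"unknown state: {state}")


def run(recognized_inputs):
    """Drive the automaton with a list of already-recognized user inputs."""
    state, slots, transcript = "ask_dest", {}, []
    for recognized in recognized_inputs:
        if state == "done":
            break
        transcript.append(PROMPTS[state].format(**slots))
        state = step(state, recognized, slots)
    return state, slots, transcript
```

Calling `run(["London", "yes", "Sunday", "no", "Friday", "yes"])` replays the example interaction and then continues it until the day is confirmed. Every path through the dialogue has to be written into `step` in advance, which is the defining limitation of this approach.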

Frame-based systems

Frame- (or template-) based systems ask the user questions that enable the system to fill slots in a template in order to perform a task such as providing train timetable information. In this type of system the dialogue flow isn’t fixed but depends on the content of the user input and the information that the system has to elicit.

Consider the following example:

System: What is your destination?
User: London.
System: What day do you want to travel?
User: Friday.

If the user provides one item of information at a time, the system behaves much like a state-based system. In a frame-based system, however, the user may provide more than the requested information:

System: What is your destination?
User: London on Friday around 10 in the morning.
System: I have the following connection...

In the second example, the system accepts the extra information and checks if any additional items of information are required before querying the database for a connection.

The context is fixed in these systems because they perform only one task. The context can be seen as a set of parameters that must be set before the system action can be executed. For example, before a system can provide information about train arrivals and departures, it needs to know parameters such as the day of travel and the time of departure. The action is performed as soon as enough information has been gathered.
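The slot-filling behaviour described above might be sketched as follows. The slot names, the wording of the third question and both helper functions are assumptions made for illustration. The `parsed` argument stands for whatever an (unspecified) language understanding step extracted from the user input.

```python
# Assumed slot names for a train timetable task.
REQUIRED_SLOTS = ("destination", "day", "time")

QUESTIONS = {
    "destination": "What is your destination?",
    "day": "What day do you want to travel?",
    "time": "What time do you want to leave?",  # assumed wording
}


def update_frame(frame, parsed):
    """Merge every known slot found in the user input into the frame."""
    for slot, value in parsed.items():
        if slot in REQUIRED_SLOTS:
            frame[slot] = value
    return frame


def next_action(frame):
    """Ask for the first missing slot, or query once the frame is complete."""
    for slot in REQUIRED_SLOTS:
        if slot not in frame:
            return ("ask", QUESTIONS[slot])
    return ("query", dict(frame))
```

If the user answers one question at a time, `next_action` walks through the questions in order, much like the state-based system. If a single answer fills all three slots, the next action is immediately the database query, so the dialogue flow follows from the content of the frame rather than from a fixed graph.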

This approach has been used in systems that provide information about movies[49], train schedules[50] and the weather[51]. The advantage of the simplicity of these domains is that it is possible to build very robust dialogue systems. One doesn’t need to obtain full linguistic analyses of the user input. For example, given the utterance “When does the Niagara Bullet leave Rochester?”, the parameters ⟨train⟩ (The Niagara Bullet), ⟨event⟩ (leaving), and ⟨location⟩ (Rochester) can easily be extracted using simple pattern matching.
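As an illustration of such pattern matching, the following sketch extracts the three parameters from the example utterance with a single regular expression. The pattern and the `extract` helper are assumptions made here. A real system would need many such patterns, and this one covers only questions of the form "When does X leave/arrive at Y?".

```python
import re

# One hand-written pattern for one question form (illustrative only).
PATTERN = re.compile(
    r"^When does (?:the )?(?P<train>.+?) (?P<event>leave|arrive at) (?P<location>.+?)\?$",
    re.IGNORECASE,
)


def extract(utterance):
    """Return the train, event and location slots, or None if nothing matches."""
    match = PATTERN.match(utterance.strip())
    return match.groupdict() if match else None


print(extract("When does the Niagara Bullet leave Rochester?"))
```

No syntactic analysis is involved: the keywords "leave" and "arrive at" anchor the pattern, and whatever surrounds them is taken as the train and the location.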

Sets of contexts

The frame-based approach can be extended with the concept of multiple contexts. For example, a simple travel booking agent may be used to book a series of travel segments. Each travel segment can be represented by a context where each context holds the information about one travel leg and is represented using the frame-based approach. With multiple contexts the system is able to recognize when the user switches context. It may also be able to identify cases where a user goes back to modify a previously discussed context. An example of a system that uses this approach is the DARPA Communicator project[12].
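A minimal sketch of this idea keeps one frame per travel leg and a pointer to the leg under discussion. The class and method names are hypothetical. Deciding when the user has switched context is the hard part and is assumed to happen elsewhere; the sketch only shows the bookkeeping once that decision has been made.

```python
class TravelDialogue:
    """Keeps one frame per travel leg and tracks which leg is active."""

    def __init__(self):
        self.legs = [{}]   # each leg is a frame: a dict of slot values
        self.active = 0    # index of the leg currently under discussion

    def new_leg(self):
        """Open a new context for the next travel segment."""
        self.legs.append({})
        self.active = len(self.legs) - 1

    def switch_to(self, index):
        """Return to a previously discussed leg, e.g. to modify it."""
        if not 0 <= index < len(self.legs):
            raise IndexError(f"no such leg: {index}")
        self.active = index

    def fill(self, **slots):
        """Store slot values in the active leg only."""
        self.legs[self.active].update(slots)
```

For example, after filling a first leg, calling `new_leg()`, filling a second leg and then calling `switch_to(0)`, a later `fill(day="Saturday")` changes only the first leg and leaves the second untouched.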

Plan-based models

Plan-based approaches are based on the view that humans communicate to achieve goals. These goals include the objective of changing the mental state of the listener. Plan-based theories of communicative action and dialogue[52, 53, 54] claim that the user input is not just to be seen as a sequence of words but as performing speech acts to achieve certain goals. The task of the listener is to infer and appropriately respond to the speaker’s underlying plan.

For example, in response to a customer’s question “Where are the steaks you advertised?”, the butcher might reply “How many do you want?”. This is an appropriate response because the butcher recognizes the underlying plan of the customer[55].

A system based on this approach is TRIPS[56], which in turn is based on TRAINS[57].

Agent-based systems

In agent-based systems a conversation is seen as an interaction between two agents. Each agent is capable of reasoning about its own actions and beliefs, and about the actions and beliefs of the other agent.

These systems tend to be mixed initiative, which means that both the user and the system can take control of the dialogue by introducing new topics. Both agents work together to achieve mutual understanding of the dialogue, using discourse phenomena such as confirmation and clarification. Agent-based systems focus on collaboration and are therefore able to deal with more complex dialogues that involve problem solving, negotiation, and so on. However, the approach does require more computing resources and processing power than other approaches.

Agent-based systems may use the Beliefs-Desires-Intentions (BDI) agent architecture[58] to model their internal state. The BDI model has been extended for use in dialogue systems to also model mutual beliefs[59], i.e. what both agents believe to be true.
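As a rough data-structure sketch, an agent's state under this model might be held as follows. The field names are assumptions, and representing mutual belief as the overlap between the agent's own beliefs and the beliefs it attributes to the other agent is a deliberate simplification made here, not the formalization used in the cited work.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    beliefs: set = field(default_factory=set)        # what this agent holds to be true
    desires: set = field(default_factory=set)        # goals it would like to achieve
    intentions: list = field(default_factory=list)   # goals it has committed to
    other_beliefs: set = field(default_factory=set)  # beliefs attributed to the other agent

    def mutual_beliefs(self):
        """Crude approximation: propositions both agents are taken to believe."""
        return self.beliefs & self.other_beliefs
```

In a dialogue, a confirmation exchange such as "Was that London?" / "Yes" could be modelled as adding the confirmed proposition to both `beliefs` and `other_beliefs`, so that it becomes part of `mutual_beliefs()`.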
