Reward-Balancing for Statistical Spoken Dialogue Systems using Multi-objective Reinforcement Learning

Academic year: 2020

Figure

table depicts the domain statistics with the number
Table 1: Task success rates (TSRs) and number of turns after 4,000 training dialogues using a success reward of 20 (baseline) compared to the optimised success reward rw s
Figure 2: The task success rates (TSR, left axes) and dialogue length in number of turns (T, right axes) for all six domains comparing the baseline (rw s=20 w,=( 0., 5
