Menu Prediction: A Minimum Description Length Approach
4.3. THE LEARNING SETTING 103
4.5.2 Two Users Discussion
The mean discrete loss and absolute loss exhibited by User 1 in Figure 4.2 and Figure 4.3 respectively shows that the MDL technique did not perform as well as the specified hypotheses technique. This is to be expected given that the specified hypotheses technique has available to it hypotheses that were hand- crafted to capture the possible concepts a user might adopt. However, with the exception of the Most Common Hypothesis approach in Figure 4.2, all the menu prediction approaches using the MDL technique performed better than the standardFirst Menu Item menu prediction approach for both users. The largest difference in terms of discrete and absolute loss between the MDL and specifed hypothesis techniques occured with the Most Common Hypothesis approach. The large difference between the two techniques is to be expected given the way in which theMost Common Hypothesis approach operates. TheMost Common Hypothesis approach performs best when there exists one hypothesis that is consistent with a majority of the training exam- ples. In the case of the MDL technique it is unlikely such a hypothesis will be found since the aim of the MDL technique is to learn individual concepts, which are often represented by short runs of training examples when concept drift exists. Although the MDL technique performs poorly with the Most Common Hypothesis approach, the difference between the two techniques is reduced in the Fixed Window approaches (using a window sizes of 2, 3, and 4) and the Most Recent Correct Hypothesis approach.
One reason for the difference between the MDL and the specified hypothe- ses techniques is the disadvantage the MDL technique has when compared to the specified hypotheses technique. The MDL technique has the disad- vantage of having to induce hypotheses before they can be applied for menu prediction. In effect, the MDL technique will never be able to apply a concept when it is first encountered, and must therefore rely on concepts recurring in the sequence of training examples. The training examples generated from User 1 did in fact exhibit recurring concepts. Figure 4.6 shows that out of the 35 hypotheses generated, 10 hypotheses were unique. Recurring concepts can also been seen for User 2 in Figure 4.7. Out of the 72 possible hypotheses generated, 14 hypotheses were unique.
The results presented for User 2, shown in Figure 4.4 and Figure 4.5 are quite different from those presented for User 1. The first difference can be seen in Figure 4.4. Unlike User 1, the performance of the MDL technique is slightly better than that of the specified hypotheses technique. If the dif- ference between the two techniques was significant, it would suggest that the MDL technique induced hypotheses that were more accurate than those specified. However, the difference in performance of the MDL technique and the specified hypotheses technique is not mirrored when absolute loss is mea- sured, as seen in Figure 4.5. One explanation for this difference is that on average a misprediction by the MDL technique resulted in more key presses (absolute loss) than a misprediction made by the specified hypotheses tech- nique. This could be explained by the MDL technique performing better in terms of discrete loss but worse in terms of absolute loss. To illustrate this point, we could imagine the case where the user’s actions could be modeled by relatively simplistic hypotheses. In such cases the MDL technique would be able to induce these simplistic hypotheses, and therefore predict the cor- rect menu item most of the time. However, in those situations where these simplistic hypotheses fail to predict the correct menu item, they may instead predict a menu item that results in a large number of key presses for the user (absolute loss). The problem is that the menu prediction approach has to choose a hypothesis to predict a menu-item. If the hypotheses available to it are poor, then in certain situations the menu-item predictions could result in an absolute loss greater than the loss of selecting menu items at random. This problem does not affect the specified hypotheses technique to the same degree. The specified hypotheses technique consists of hand-crafted hypotheses and, if chosen correctly, will contain hypotheses that accurately model both the simplistic and complex behaviour of a user.
4.5.3
Scenario
The data presented in Section 3.9.2 on page 90, was used to evaluate the menu prediction approaches using the MDL technique. As discussed in Sec- tion 3.9.2, the scenario-driven collection of data was used to compare the menu prediction approaches in a more controlled fashion.
4.5. EXPERIMENTAL EVALUATION 115
Data was gathered from five different users performing the same scenario in the Contacts application. The scenario asked each user to perform the following tasks in the NokiaTMContacts application:
1. Create 10 new contacts.
2. Assign four of the contacts to one group and six to another group. 3. Edit the four contacts who belong to a single group.
4. Send an SMS to the four contacts in a single group. 5. Delete two contacts.
6. Create one more contact and assign it to a group.
As noted in Section 3.9.2, the user interface allowed completely different ways to accomplish the same task, therefore these tasks could be accom- plished by different key press sequences.
Using the data collected from five users who performed the scenario, the effectiveness of the menu prediction approaches using the MDL technique was determined. Figure 4.8(a), 4.8(b), 4.8(c), 4.8(d), and 4.8(e) compares the mean discrete loss between the MDL technique and the specified hypotheses technique for each user. Figure 4.9(a), 4.9(b), 4.9(c), 4.9(d), and 4.9(e) shows the mean absolute loss for each user.
Figure 4.10(a), Figure 4.10(b), Figure 4.10(c), Figure 4.10(d) and Fig- ure 4.10(e) shows the codelength of the hypotheses produced after every example for the five users. In addition, the size of the restricted hypothesis space used by the menu prediction approaches is overlayed on these graphs. These graphs show how the codelength of successive hypotheses change over a series of examples and also how the restricted hypothesis space grows over a series of examples.