Chapter 8 Evaluation Results
8.3 Empirical Evaluation
8.3.1 Experiment 1: To investigate the effectiveness of the user-controlled adaptation provided by the MDL concept as applied in the IPNS prototype in comparison to navigation without the presence of personalised tools
Task: The users in Group 2.1 and Group 2.2 were asked to explore the designated system and use it to answer nine questions in approximately fifteen minutes. The users were requested to write down the times at which they started and finished. Both groups were then required to repeat the task, this time working with the other system. Table 8-1 shows the task allocation for both groups of subjects.
Appendix B (III-a: page 194) and (III-b: Q3: page 196) document the tasks for Group 2.1, and Appendix B (IV-a: page 201) and (IV-b: Q3: page 204) present the tasks for Group 2.2.
             Experimental conditions
Group        Condition 1                   Condition 2
Group 2.1    System Non link               IPNS
             (Non-personalised system)
Group 2.2    IPNS                          System Non link
                                           (Non-personalised system)
Table 8-1: The allocation of subjects for Experiment 1
Independent variable: System Non link and IPNS.
Dependent variable: Percentage of task completed.
Hypothesis 1:
H1: The percentage of task completed is significantly improved by the set of links presented by the IPNS prototype in comparison to navigation without the presence of personalised features.
H0: The percentage of task completed is not significantly improved by the set of links presented by the IPNS prototype in comparison to navigation without the presence of personalised features.
Result for Experiment 1:
The percentage of task completed was defined as an overall measure of task completion that took account of the time a user took to finish the task (time), the number of questions that the user completed (completion), and the number of questions that the user answered correctly (score). The percentage of task completed was obtained from the sum of the following measurements:
• Time percentage (maximum 100%);
• Completion percentage (maximum 100%);
• Score percentage (maximum 100%);
giving the potential for a maximum of 300%.
Time percentage was calculated against the time allowance for completing the task (i.e. 15 minutes): a user who finished answering the questions in less than 15 minutes obtained 100 percent, and the longer beyond 15 minutes a user spent on the task, the lower the percentage the user attained. Table 8-2 describes the calculation of the time percentage.
Time (min) Percentage
< 15 100
16-19 90
20-34 80
> 35 70
Table 8-2: The calculation of Time percentage
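The banding in Table 8-2 can be expressed as a small lookup. The following Python sketch is illustrative only and is not taken from the study materials. The function name `time_percentage` is hypothetical, times are assumed to be whole minutes, and because the table does not list exactly 15 or exactly 35 minutes, the sketch assumes that 15 minutes falls in the 100 band and 35 minutes in the 70 band.

```python
def time_percentage(minutes: int) -> int:
    """Return the Time percentage for a task duration in whole minutes.

    Bands follow Table 8-2. The treatment of exactly 15 and exactly
    35 minutes is an assumption, since the table omits both values.
    """
    if minutes < 0:
        raise ValueError("minutes must not be negative")
    if minutes <= 15:   # assumption: exactly 15 minutes still earns 100
        return 100
    if minutes <= 19:   # 16-19 minutes
        return 90
    if minutes <= 34:   # 20-34 minutes
        return 80
    return 70           # assumption: 35 minutes or more earns 70
```

Under these assumptions a subject who took 17 minutes would receive 90, and one who took 40 minutes would receive 70.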
Completion percentage was defined as the number of questions a user completed, expressed as a percentage of the nine questions set; a user who completed all nine questions gained 100 percent. The fewer questions the user completed, the lower the percentage the user achieved. Table 8-3 demonstrates the calculation of the completion percentage.
No. of Questions    Completion Percentage
9                   100
8                   88.89
7                   77.78
6                   66.67
5                   55.55
4                   44.44
3                   33.33
2                   22.22
1                   11.11
Table 8-3: The calculation of Completion percentage
Score percentage was calculated in the same manner as the completion percentage, except that the number of questions in this case was the number that the user answered correctly. A user who answered all nine questions correctly obtained 100 percent; the more questions the user answered correctly, the higher the percentage obtained. Table 8-4 shows the calculation of the Score percentage.
No. of Questions    Score Percentage
9                   100
8                   88.89
7                   77.78
6                   66.67
5                   55.55
4                   44.44
3                   33.33
2                   22.22
1                   11.11
Table 8-4: The calculation of Score percentage
The calculation of the percentage of task completed for each subject can be found in Appendix C (II: page 216).
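To make the scoring scheme concrete, the sketch below combines the three measurements into the overall percentage of task completed. It is an illustration under stated assumptions rather than the procedure actually used to produce Appendix C: the function names are hypothetical, the example subject is invented, and the question-based percentages are computed unrounded as count / 9 × 100, so results may differ from the tabulated values in the last decimal place.

```python
TOTAL_QUESTIONS = 9  # number of questions in the task


def question_percentage(count: int) -> float:
    """Percentage of the nine questions, used for both completion and score."""
    if not 0 <= count <= TOTAL_QUESTIONS:
        raise ValueError("count must be between 0 and 9")
    return 100 * count / TOTAL_QUESTIONS


def task_completed(time_pct: float, completed: int, correct: int) -> float:
    """Sum of Time, Completion and Score percentages (maximum 300)."""
    return time_pct + question_percentage(completed) + question_percentage(correct)


# Invented example: Time percentage of 90, eight questions completed,
# seven answered correctly.
example = task_completed(time_pct=90, completed=8, correct=7)
print(round(example, 2))  # prints 256.67
```

A subject with a Time percentage of 100 who completed and correctly answered all nine questions would reach the maximum of 300.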
A one-tailed related t test (paired-samples t test), a model used to compare the means of two conditions in a repeated-measures design (i.e. the same subjects perform both conditions) (Greene and D’Oliveira, 1999; Field, 2005), was used to test Hypothesis 1. Table 8-5 gives the descriptive statistics for the two systems, and Table 8-6 shows that the null hypothesis for Hypothesis 1 was rejected (t(15) = -3.329, one-tailed p = 0.005/2 = 0.0025, i.e. p < 0.05). This indicates that the set of links presented by the IPNS prototype significantly improved the percentage of task completed in comparison to navigation without the presence of personalised features. Figure 8-3 presents the mean percentage of task completed for each system in graphical form. Appendix C (II: page 216) documents the trial data for Experiment 1.
Percentage of task completed
System                                       Mean      No. of subjects    Std. Deviation    Std. Error Mean
System Non link (Non-personalised System)    222.64    16                 30.94             7.736
IPNS                                         244.79    16                 30.97             7.743
Table 8-5: Descriptive statistics for the two systems in Experiment 1 produced by SPSS
Paired Differences (95% Confidence Interval): Percentage of task completed difference
Pair                              Mean      Std. Deviation    Std. Error Mean    t         df    Sig. (2-tailed)
Non-personalised system - IPNS    -22.15    26.62             6.65482            -3.329    15    0.005
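The reported test statistic can be cross-checked from the summary figures in the paired-differences table. The sketch below is only a consistency check, not a reproduction of the SPSS analysis: it uses the rounded values printed above, so the recomputed t differs slightly from the reported -3.329. Halving the two-tailed significance to obtain the one-tailed value assumes that the observed difference lies in the direction predicted by H1, which is the case here because the IPNS mean is the higher of the two.

```python
import math

# Summary values copied from the paired-differences table (rounded as printed).
n = 16                 # number of subjects
mean_diff = -22.15     # mean of (non-personalised system - IPNS)
sd_diff = 26.62        # standard deviation of the paired differences
p_two_tailed = 0.005   # two-tailed significance reported by SPSS

std_error = sd_diff / math.sqrt(n)   # standard error of the mean difference
t_stat = mean_diff / std_error       # paired-samples t statistic
df = n - 1                           # degrees of freedom
p_one_tailed = p_two_tailed / 2      # valid because the effect is in the predicted direction

print(f"SE = {std_error:.3f}, t({df}) = {t_stat:.2f}, one-tailed p = {p_one_tailed}")
```

The recomputed standard error (about 6.655) and t value (about -3.33) agree with the tabulated 6.65482 and -3.329 to within rounding.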
[Bar chart: Comparison of the percentage of task completed between two systems. x-axis: Experimental conditions (System Non link, IPNS); y-axis: Percentage of task completed (0 to 300); mean values: System Non link 222.64, IPNS 244.79.]
Figure 8-3: Percentage of task completed between non-personalised system and IPNS
Comments:
The original number of subjects (N) was twenty-four. However, the trial on the first day, which involved eight respondents (four in each group), had to be excluded, leaving sixteen subjects for the data analysis. The reason was that the task asked the subjects to find information for the nine questions but did not indicate where to find it (e.g. look in Carbohydrate >> Starches). As a result, users without background knowledge of the subject domain had to go through every single page, so the trial might not have provided the measure we were looking for (the percentage of task completed within the approximate time limit). At the end of the session, the respondents gave feedback on this issue. In response to these comments, the second trial suggested the location where the answers to the questions could be found, although the subjects still had to look through and locate the material for the questions themselves.