2.4.4.7. Measuring Performance with Simulated Work Tasks

Studies of simulated work tasks and search behavior often share a common methodological approach. These studies often incorporate objective measures, such as search interaction measures, and subjective self-report measures. There is also usually some measure of cognitive strategy, most often a think-aloud protocol, though interviews may also be employed for the same purpose. The effect of this methodological triad is that these studies can present a multifaceted picture of searching: the search behavior itself, the motivations behind it, and evidence of a particular search strategy.

Li and Hu (2013) used both simulated and real work tasks to evaluate the usefulness of a digital library. Li and Hu do not describe how they created their simulated work tasks. They also gave participants criteria for the real tasks they were required to bring in, specifying that the task should have been part of a recent class assignment. Participants completed pre- and post-task questionnaires, as well as an evaluation questionnaire which assessed items such as search skills and performance. Li and Hu found significant differences in topic familiarity and search experience between simulated and natural tasks, in that participants were more familiar and experienced with their own tasks. However, there were no significant differences in other aspects of the task, such as complexity, urgency, and difficulty in making relevance judgments. This indicates that participants did not feel more urgency in their own tasks, though Li and Hu claim that it is “obvious” that the real task would seem less complex and more urgent than the simulated task. One important significant difference was in knowledge of task procedure; participants felt that they had less knowledge of the procedure to complete the simulated task than the real task. This is related to Svarre and Lykke’s (2014) identification of structural knowledge as important to how people perceive and complete simulated tasks. Participants reported a low ability to predict the difficulty of the real task and found it harder after searching. They reported a similarly low ability to predict the difficulty of the simulated task, but found it easier after searching. Participants submitted more queries for real tasks, but viewed more search results pages, downloaded more documents, and had slightly longer queries for simulated tasks than for real tasks.

Participants also felt more success, frustration, and satisfaction with real tasks than simulated tasks, but these differences were not significant. However, Li and Hu found that feelings of success were significantly correlated with confidence and perceptions of task complexity and topic familiarity for the real task, i.e., participants felt more success if they were familiar with the topic. For the simulated task, satisfaction was significantly correlated with task difficulty and knowledge of task procedure, again illustrating the effect of structural knowledge on the ability to complete the task. Though the findings from this work are mixed, and many results are not significant, it does present some evidence for differences in perception and search behavior during simulated tasks. This challenges the idea that simulated tasks represent a perfect compromise between effective system evaluation and realistic information needs.

Poddar and Ruthven (2010) investigated how natural and assigned tasks affect the emotional aspects of the search experience. Participants were given three different assigned tasks: factual, complex, and exploratory. Participants were also asked to bring in their own task, which was not restricted by any criteria. Verbal utterances, observed actions, and questionnaire data were used in this evaluation. Generally, participants felt that the simulated tasks were less interesting than their own task, but there were no significant differences between the factual task and the participants’ own tasks, suggesting that, at least in structure and content, the genuine tasks may have been most similar to the factual task. There were more positive emotions present before and after the natural task than the assigned tasks. Also, participants tended to bring in tasks similar to ones they had completed before, and ones about which they had topical knowledge. This could explain the tendency to estimate lower task difficulty for natural tasks. Participants also used more search strategies, and expressed more positive body language as well as greater confidence in their search, for natural tasks. Participants had similar levels of interest in all of the assigned task types, but struggled to form queries for the exploratory task and to decide on the steps for completing the complex task. This study showed that task source can contribute to the emotional state of the participant, which in turn affects their search behavior. Simulated work tasks should seek to emulate natural search tasks and produce similar emotions.

Borlund, Dreier and Byström (2012) conducted two studies comparing perceptions of time spent searching between simulated work tasks and natural tasks. In the first study, the researchers created three simulated work task situations, which were pilot tested, and then evaluated by means of questionnaires and relevance assessments as well as post-search interviews. Borlund et al. asked participants to bring in their own tasks and advised them that their information needs should be either verificative (checking a specific piece of information), conscious topical (finding information about a familiar topic) or muddled topical (exploring an unknown topic). This framing illustrates one issue with eliciting natural tasks in IR evaluation: shaping. For comparison purposes, even natural information needs must be categorized and refined.

Borlund et al. found that most participants in the first study said that time spent searching for their own topics (and one simulated topic) was due to its interestingness, while participants in the second study said that interestingness contributed to time spent searching more for ‘conscious topical’ needs, followed by muddled topical and verificative. 55% of the participants in the first study said that time spent searching was an indicator of the simulated task involving a lot of information, while 67% of participants said they felt time spent searching on the verificative information need was a result of the topic having little information. Lastly, 86% of participants said time spent searching was an indicator of a simulated task being too easy, while 67% of participants in study 2 said time spent searching was due to the verificative information need being easy, versus 56% who said the muddled topical need was difficult. Overall, Borlund et al. found that interest was a main indicator of time spent searching, but given that a variety of reasons were explored in this study, interest cannot completely explain time spent searching.

In addition to the findings, this study offered the opportunity for comment on the methodological implications of simulated work tasks. In comparing the two studies, Borlund et al. state that participants tended to search longer during simulated tasks, which could be an indication of “over-performance” in an attempt to please the researcher. Therefore, though simulated tasks are designed to mimic real information needs, there may always be a distinct difference in behavior during simulated tasks because of the experimental context: naturalistic search tasks in an experimental setting still did not yield similar task completion times.

2.4.5. Search User Interfaces. The search user interface aids users “in the expression
