This section describes the experiment in detail. The design of the experiment is outlined first. The various elements of the experiment are then covered, including data collection, participants and the tasks to be performed. The section concludes with a detailed description of the entire experimental procedure. Ethics approval for this experiment was sought from the NMMU Ethics Clearance Committee, which granted approval under reference number H11-SCI-CS-008.
5.4.1 Experimental Design
The experiment focussed on a scenario with a specific goal to be achieved using Co-IMBRA. Teams of two participants followed a set task list to achieve this goal (Appendix E). They were given some basic instructions and a short demonstration of how to interact with the system. They were not told exactly how they were to work together or who was to perform which tasks. Performance metrics were collected by the system, including metrics that measured the degree of collaboration exhibited, the number of duplications of controls and the number of updates on controls. Additionally, the test was recorded using an overhead camera, and the test moderator was present to make notes and comments. Following the test, a post-test questionnaire (Appendix F) was used to collect subjective metrics. Fifteen teams of two participants each were used for the evaluation (n=30).
5.4.2 Data Collection Methods
Three methods were used to collect the results of the experiment: system-measured methods, observational methods and subjective responses.
System measured. Co-IMBRA includes mechanisms to collect data about the type of interaction exhibited by the participants; some are a part of the prototype design and some were added purely for collecting data for this experiment. These mechanisms consist of a log of all interactions with timestamps, the files that are downloaded, notes and information added by the participants and software counters of types of collaborative interaction for statistical purposes.
Observation was employed by the test moderator, who was present during the user tests. Additionally, the user tests were recorded on video and audio by an overhead camera. The test moderator made notes about the types of interaction displayed as well as comments the users made during the test.
Subjective responses were made by the participants, who were required to complete a background biographical questionnaire before the test and a post-test questionnaire, rating the system on various metrics. A section for comments was also provided, encouraging the users to give qualitative feedback.
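The system-measured mechanisms described above (a timestamped interaction log plus counters per interaction type) can be illustrated with a minimal Python sketch. This is not the Co-IMBRA implementation: the class name `InteractionLog`, the participant identifiers and the event names `duplicate_control` and `update_control` are hypothetical, chosen only to mirror the duplication and update counts mentioned in Section 5.4.1.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class InteractionLog:
    """Hypothetical append-only log of timestamped interactions with counters."""
    entries: list = field(default_factory=list)
    counters: Counter = field(default_factory=Counter)

    def record(self, participant_id, event_type, detail="", timestamp=None):
        # Stamp the event on arrival unless a timestamp is supplied.
        timestamp = timestamp or datetime.now(timezone.utc)
        self.entries.append((timestamp, participant_id, event_type, detail))
        # Keep a running count per interaction type for later statistics.
        self.counters[event_type] += 1


# Illustrative events only; these are not data from the experiment.
log = InteractionLog()
log.record("P01", "duplicate_control", "result list")
log.record("P02", "update_control", "result list")
log.record("P02", "update_control", "note added")
print(log.counters["update_control"])  # 2
```

Keeping the raw entries alongside the counters means the counts can be recomputed or cross-checked from the log after a test session.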
5.4.3 Metrics
It was necessary for the metrics to measure adequately both the effectiveness of the information retrieval and the degree to which the system supports collaboration. The following metrics were thus used:
Effectiveness. This was measured by task completion rate, i.e. the proportion of tasks successfully completed by the team.
Efficiency. This was measured by time on task for successfully completed tasks.
Collaboration style. This was a subjective classification assigned by the test moderator into one of three types of collaboration.
Collaboration rating. This was a subjective rating assigned by the test moderator of the degree to which collaboration was exhibited.
User satisfaction. This was measured by 7-point Likert ratings given by the participant in the post-test questionnaire.
These five metrics were recorded using the three data collection methods specified in the previous section. Effectiveness and efficiency were system measured, collaboration style and collaboration rating were assigned by the test moderator through observation, and user satisfaction was recorded through the participants’ subjective responses.
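The two system-measured metrics can be sketched as follows. This is an illustration under stated assumptions rather than the analysis actually performed: the record layout (`completed`, `seconds`) and the sample values are hypothetical, and efficiency is summarised here as the mean time over completed tasks, which is one possible reading of "time on task for successfully completed tasks".

```python
from statistics import mean


def completion_rate(task_results):
    """Effectiveness: proportion of tasks the team completed successfully."""
    completed = sum(1 for r in task_results if r["completed"])
    return completed / len(task_results)


def mean_time_on_task(task_results):
    """Efficiency: mean time in seconds over successfully completed tasks only."""
    times = [r["seconds"] for r in task_results if r["completed"]]
    return mean(times) if times else None


# Hypothetical results for one team; not data from the experiment.
team_tasks = [
    {"task": 1, "completed": True, "seconds": 40.0},
    {"task": 2, "completed": True, "seconds": 60.0},
    {"task": 3, "completed": False, "seconds": 90.0},
    {"task": 4, "completed": True, "seconds": 50.0},
]

print(completion_rate(team_tasks))    # 0.75
print(mean_time_on_task(team_tasks))  # 50.0
```

Excluding failed tasks from the timing keeps a team that abandons tasks quickly from appearing more efficient than one that completes them.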
5.4.4 Instruments and Location
The experiment took place in the CoE Lab of the Department of Computing Sciences at the Nelson Mandela Metropolitan University. The test was performed on the Co-IMBRA prototype running on the Telkom CoE/NMMU Multi-touch Surface. The test was monitored by an overhead camera recording audio and video. The test moderator was present to observe and guide the participants where necessary. The test moderator also had access to the video and audio recording after the experiment.
5.4.5 Participants
Three important factors were considered in the selection of participants. These were the participant profile, the composition of the teams and the number of participants that were required.
Profile. Typical user testing of multi-touch systems used academic or professional computer users in pairs who had a pre-existing relationship. The participant profile for the user test was chosen to approximate the intended end users. Since Co-IMBRA is generally intended for research, a sample composed of students from the Department of Computing Sciences and other departments at the University was justified. The scenario used in the test represents a typical use of the system by a University student. Users of the system were expected to have basic computer literacy. Participants were therefore required to have completed a basic computer literacy course, such as the compulsory course completed by all students at NMMU, or to declare themselves computer literate. In addition, all participants were required to be over the age of 18.
Teams. As Co-IMBRA is a collaborative system, it was necessary to have participants working in teams. Participants completed the tasks in groups of two. It was required that users of the system be at least briefly acquainted, as this is representative of typical use of a CIR system (Morris and Horvitz, 2007).
Number of participants. Fifteen groups were used in the evaluation, making a total of thirty participants. Usability and satisfaction were assessed taking the participants’ responses individually (n=30), while performance and collaboration were assessed per team (n=15). This represents a similar or slightly larger sample size to those utilised in the evaluation of various multi-touch systems discussed in Section 5.3.2.
5.4.6 Tasks
The participant teams were provided with a task list (Appendix E) that described the scenario as well as a list of 14 tasks that had to be completed in order to achieve the overall goal of the scenario. Each team member was provided with an identical copy of the task list, but they were instructed that they needed to work together to complete the tasks. That is, the team needed to decide ad hoc who was to perform which tasks or who had which role in the team.
5.4.7 Questionnaires
Two questionnaires were distributed to each of the participants, namely a pre-test biographical questionnaire (Appendix D) and a post-test questionnaire (Appendix F). The pre-test questionnaire was based on the Common Industry Format (CIF) for usability testing and was used to collect demographic and experience details for each participant. The post-test questionnaire was adapted from the Questionnaire for User Interface Satisfaction (QUIS) (Chin, Diehl and Norman, 1988). It presented statements paired with 7-point Likert scales, anchored by “Strongly Disagree” and “Strongly Agree” at either end of the scale. The adapted version contained some additional questions so that the participants could rate the system on the support it provides for CIR functionality, resulting in the following 5 questionnaire sections:
Cognitive Load;
Overall Satisfaction;
Usability;
Collaboration; and
General Comments.
The questionnaire was distributed to the participants after the test.
5.4.8 Test Procedure
undergraduate laboratories. Participants were expected to know each other or at least have met, rather than be complete strangers as this is more representative of typical use of a collaborative system.
The participants were shown the CoE Lab and the hardware. They were given a brief introduction to the research, multi-touch technology, the experimental design and the operation of the system. Appendix B shows the information that was presented to them verbally. They were then given the pre-test questionnaire. Once the questionnaire was completed, the test moderator started Co-IMBRA, entered the participant numbers into the system and started the recording equipment. As far as possible, the participants were allowed to complete the tasks entirely on their own. Gentle encouragement to participate was given in the few cases where one user dominated. Tasks were explained in cases where the participants did not understand the wording. Participants were not corrected when they performed a task incorrectly. They were also not assisted with division of workload and were told to discuss between themselves any problems that arose. After the task list was completed, the test moderator presented each group member with the post-test questionnaire. After the questionnaire was completed, the group was dismissed.
5.4.9 Statistics
The data collected from each user test were captured in an Excel spreadsheet for analysis. The data were stored per individual. The NMMU Unit for Statistical Consultation provided advice and assistance with deriving the statistics.
Excel was used to produce descriptive statistics, such as the mean, median and mode, to provide a good picture of the data and demonstrate general trends. Cronbach’s alpha was calculated for each questionnaire subsection to confirm the reliability of the responses.
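The statistics named above can be illustrated outside Excel with a short Python sketch. It is offered only as an illustration of the calculations, not as the procedure used in this study, and the ratings shown are hypothetical. Cronbach’s alpha is computed in its standard form, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items in the questionnaire subsection.

```python
from statistics import mean, median, mode, variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one questionnaire subsection.

    responses: one list per participant, holding that participant's
    rating for each of the k items in the subsection.
    """
    k = len(responses[0])
    items = list(zip(*responses))  # regroup the ratings by item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical 7-point Likert ratings; not data from the experiment.
ratings_one_item = [5, 6, 6, 7, 4, 6]
print(mean(ratings_one_item), median(ratings_one_item), mode(ratings_one_item))

subsection = [
    [6, 5, 6],
    [4, 4, 5],
    [7, 6, 7],
    [3, 4, 3],
    [5, 5, 6],
]
print(round(cronbach_alpha(subsection), 3))
```

Sample variance is used for both the item and total-score terms; the choice does not affect alpha as long as the same estimator is used for both.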