Data description - Study-3: Pilot runs - Shah_unc_0153D

3.4 Study-3: Pilot runs

3.4.2 Data description

This subsection provides description about various forms of log data collected during the pilot runs. Each piece of data is analyzed considering an individual as the unit, as well as a pair as the unit. While describing individual data, the unique and overlapping objects are computed for that particular individual during the session. For example, overlapping queries for an individual show the number of queries that the same individual re-used. On the other hand, while describing pair data, the unique and overlapping objects are computed within that pair. In the same example, overlapping queries are the queries that were re-used by either of the members of that pair that at least one of them had used before.

The participants (36 over three runs) used a total of 281 queries, with 163 unique queries. The distributions of number of queries for individuals and groups are shown in Figures 3.5 and 3.6. Both of these distributions are summarized in Figure 3.7.

Based on these figures, we can see that there is quite a bit of overlap in queries for individuals as well as groups. A very few individuals never repeated a query, but while considering the group as the unit, we can see that every unit had some extent of query re-usage.

Figure 3.5: Query distribution for the individuals. Consecutive numbers are pairs (e.g., 1-2, 3-4, etc.).

Figure 3.7: Query distribution summary.

The experimental version of Coagmento was designed to retrieve only up to 100 documents per query.9 It was found that over all the runs, the users saved a total of 318 documents, consisting of 89 unique documents. Figure 3.8 and 3.9 present the distributions of documents viewed by individuals as well as groups. These distributions are summarized in Figure 3.10.

As we can see, these distributions relating to the document viewing are very close to the distributions of queries. We can say there was a high correlation between using a query and viewing a document. However, unlike queries, we find less re-usage when viewing documents. Several individuals did not have any overlapping documents, which means after visiting a document, they never went back to it. Looking at the distribution for the groups, we find two of the groups (1 and 17) never revisiting a single document. From log mining and discussions with the participants, it became clear that the participants rarely looked at a document already viewed by their teammates. Given the nature of the system, the participants had to start with a query to get a list of

Figure 3.8: Viewed documents distribution for the individuals. Consecutive numbers are pairs (e.g., 1-2, 3-4, etc.).

Figure 3.10: Viewed documents summary.

documents to look at. At the level of query formulation, the participants in the same team may use the same or similar queries, but once they get a list of results, they would avoid looking at each other’s documents. This has two implications. First, if a task is time-bound, exploratory, and easily dividable, the participants may try not to do overlapping work. They may work individually trying to get as much information as possible, and then combine with their collaborators’ individual information to create the group’s product. Second, in order to easily know what has already been done by one’s self and/or others in the group, the interface needs to provide ready support. This reaffirms the value of awareness in collaborative projects.

Of these viewed documents, the participants saved 99 documents, consisting of 44 unique documents overall. The distributions for the saved documents for individuals as well as groups are shown in Figures 3.11 and 3.12, and their summaries are given in Figure 3.13. From these distributions, we can see that, in most of the cases, the number of total documents was the same as the number of unique documents that were saved. This makes sense as, typically, one would want to save a document only once. For some reason, some of the participants saved the same document more than once.

It is possible that they accidentally pressed the ‘Save’ button multiple times for the same document.

Figure 3.11: Saved documents distribution for the individuals. Consecutive numbers are pairs (e.g., 1-2, 3-4, etc.).

We could also see some of the individuals affecting the group statistics. For instance, group 11 consisted of individual users 21 and 22. The group used a total of 31 queries, of which 24 were issued by user 21, almost 6 times as many as his teammate (user 22). Such disparity in individual performance within a given group can have significant impact on group productivity as well as an individual’s perception about the success of the collaboration. Further discussion of this topic will be presented in Section 3.4.3 using participants’ responses on post-task and post-session questionnaires.

We can also see that nearly half of the documents that were viewed were saved. This can indicate user perceived relevance, taken with the caveat that there was little

Figure 3.12: Saved documents distribution for the groups.

cost to saving potentially relevant documents in a study where the participants did not have to do further processing.

The participants did not seem to be using the ‘keep for discussion’ feature much, where they can put a document for discussion/review with/by their partners. Overall, only 15 users used this feature and only 34 (24 unique) documents were kept for discussion, and none of them were transferred to ‘save’ or ‘discard’. Interestingly, ‘user08’ once transferred an already saved document to ‘discussion’, but that document was not discussed later and it stayed in the ‘discussion’ box. Such a behavior should not be surprising, given that the participants had only about 10 minutes to finish their tasks. Coagmento allows the user to collect snippets by highlighting portions of a displayed document and clicking on the ‘Snip’ button on the interface’s toolbar. The distributions for collected snippets for individuals and groups are given in Figures 3.14 and 3.15, and their summaries are presented in Figure 3.16.

Figure 3.14: Snippets distribution for the individuals. Consecutive numbers are pairs (e.g., 1-2, 3-4, etc.).

Figure 3.15: Snippets distribution for the groups.

While most of the participants liked this feature, not many snippets were collected. This could be a function of the task, the expectations from the participants, the time given, and the learning curve. It is highly likely that, if they were required to use the collected snippets later for preparing a report or doing a similar writing task, they may have collected more snippets.

Chat was the “trickiest” feature of this interface. This is mainly because of the fact that people are more used to chat/IM than any of the other tools provided. They had certain expectations while using this feature, which were not quite met. A total of 216 chat messages were sent between the partners. The distributions of the chat messages for individuals and groups are shown in Figures 3.17 and 3.18, with the summaries in

Figure 3.16: Snippets summary.

Figure 3.17: Chat messages distribution for the individuals. Consecutive numbers are pairs (e.g., 1-2, 3-4, etc.).

Figure 3.18: Chat messages distribution for the groups.

Analyzing these chat messages, it became clear that most of the communications were not regarding discussion of the documents, but on more basic issues. For instance, one user asked his partner why he could get only 100 results. Many messages were simply funny remarks or casual conversations not directly related to task completion.

In document Shah_unc_0153D_11239.pdf (Page 142-153)